Overview


Dataset statistics

Number of variables: 107
Number of observations: 836209
Missing cells: 36024469
Missing cells (%): 40.3%
Total size in memory: 682.6 MiB
Average record size in memory: 856.0 B
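
The derived figures in this table can be checked from the raw counts. The following is a minimal sketch, assuming the percentage is taken over all cells (variables times observations) and that 1 MiB is 1024 * 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 107
n_observations = 836_209
missing_cells = 36_024_469
total_size_mib = 682.6

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The reported size is rounded to 0.1 MiB, so the per-record
# figure recomputed here is only approximate.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the recomputed values agree with the reported 40.3% and 856.0 B.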

Variable types

Text: 107

Dataset

Description: Naturalis Biodiversity Center (NL) - Botany 0061690-241126133413365
URL: https://doi.org/10.15468/dl.4ze7ns

Alerts

license has constant value "CC0_1_0" Constant
publisher has constant value "Naturalis Biodiversity Center" Constant
rightsHolder has constant value "Naturalis Biodiversity Center" Constant
institutionID has constant value "https://ror.org/0566bfb96" Constant
collectionCode has constant value "Botany" Constant
occurrenceStatus has constant value "PRESENT" Constant
sampleSizeValue has constant value "0.0 m" Constant
higherGeography has constant value "51.41942" Constant
latestEraOrHighestErathem has constant value "Bakker S" Constant
highestBiostratigraphicZone has constant value "2608920" Constant
identificationID has constant value "Physcia caesia (Hoffm.) Fürnr." Constant
identificationReferences has constant value "Fungi|Lichenes-Lecanoromycetes|Caliciales|Lichenes-Physciaceae" Constant
identificationVerificationStatus has constant value "Fungi" Constant
identificationRemarks has constant value "Ascomycota" Constant
taxonID has constant value "Lecanoromycetes" Constant
scientificNameID has constant value "Caliciales" Constant
parentNameUsageID has constant value "Physciaceae" Constant
taxonConceptID has constant value "Physcia" Constant
originalNameUsage has constant value "caesia" Constant
namePublishedInYear has constant value "SPECIES" Constant
subfamily has constant value "NL" Constant
tribe has constant value "2024-11-01T10:28:05.946Z" Constant
cultivarEpithet has constant value "true" Constant
verbatimTaxonRank has constant value "2608920" Constant
vernacularName has constant value "2608920" Constant
nomenclaturalStatus has constant value "180" Constant
taxonRemarks has constant value "10861608" Constant
elevation has constant value "2608920" Constant
elevationAccuracy has constant value "Physcia caesia" Constant
depth has constant value "Physcia caesia (Hoffm.) Fürnr." Constant
depthAccuracy has constant value "Physcia caesia (Hoffm.) Hampe ex Fürnr." Constant
typifiedName has constant value "NE" Constant
protocol has constant value "DWC_ARCHIVE" Constant
lastCrawled has constant value "2024-11-01T08:50:07.799Z" Constant
isSequenced has constant value "false" Constant
publishedByGbifRegion has constant value "EUROPE" Constant
otherCatalogNumbers has 626711 (74.9%) missing values Missing
eventDate has 143292 (17.1%) missing values Missing
startDayOfYear has 143292 (17.1%) missing values Missing
endDayOfYear has 143292 (17.1%) missing values Missing
year has 143292 (17.1%) missing values Missing
month has 176649 (21.1%) missing values Missing
day has 258263 (30.9%) missing values Missing
habitat has 686738 (82.1%) missing values Missing
sampleSizeValue has 836208 (> 99.9%) missing values Missing
higherGeography has 836208 (> 99.9%) missing values Missing
continent has 150752 (18.0%) missing values Missing
stateProvince has 512889 (61.3%) missing values Missing
locality has 123808 (14.8%) missing values Missing
verbatimElevation has 540040 (64.6%) missing values Missing
decimalLatitude has 483055 (57.8%) missing values Missing
decimalLongitude has 483055 (57.8%) missing values Missing
latestEraOrHighestErathem has 836208 (> 99.9%) missing values Missing
highestBiostratigraphicZone has 836208 (> 99.9%) missing values Missing
identificationID has 836208 (> 99.9%) missing values Missing
typeStatus has 822537 (98.4%) missing values Missing
identifiedBy has 693965 (83.0%) missing values Missing
dateIdentified has 763698 (91.3%) missing values Missing
identificationReferences has 836208 (> 99.9%) missing values Missing
identificationVerificationStatus has 836208 (> 99.9%) missing values Missing
identificationRemarks has 836208 (> 99.9%) missing values Missing
taxonID has 836208 (> 99.9%) missing values Missing
scientificNameID has 836208 (> 99.9%) missing values Missing
parentNameUsageID has 836208 (> 99.9%) missing values Missing
taxonConceptID has 836208 (> 99.9%) missing values Missing
originalNameUsage has 836208 (> 99.9%) missing values Missing
namePublishedInYear has 836208 (> 99.9%) missing values Missing
subfamily has 836208 (> 99.9%) missing values Missing
tribe has 836208 (> 99.9%) missing values Missing
genus has 13165 (1.6%) missing values Missing
genericName has 13241 (1.6%) missing values Missing
specificEpithet has 78237 (9.4%) missing values Missing
infraspecificEpithet has 778925 (93.1%) missing values Missing
cultivarEpithet has 836208 (> 99.9%) missing values Missing
verbatimTaxonRank has 836208 (> 99.9%) missing values Missing
vernacularName has 836208 (> 99.9%) missing values Missing
nomenclaturalStatus has 836208 (> 99.9%) missing values Missing
taxonRemarks has 836208 (> 99.9%) missing values Missing
elevation has 836208 (> 99.9%) missing values Missing
elevationAccuracy has 836208 (> 99.9%) missing values Missing
depth has 836208 (> 99.9%) missing values Missing
depthAccuracy has 836208 (> 99.9%) missing values Missing
distanceFromCentroidInMeters has 833143 (99.6%) missing values Missing
issue has 776215 (92.8%) missing values Missing
mediaType has 57645 (6.9%) missing values Missing
genusKey has 13165 (1.6%) missing values Missing
speciesKey has 78171 (9.3%) missing values Missing
species has 78171 (9.3%) missing values Missing
typifiedName has 836208 (> 99.9%) missing values Missing
gbifRegion has 151640 (18.1%) missing values Missing
level0Gid has 497950 (59.5%) missing values Missing
level0Name has 497950 (59.5%) missing values Missing
level1Gid has 499035 (59.7%) missing values Missing
level1Name has 499035 (59.7%) missing values Missing
level2Gid has 501994 (60.0%) missing values Missing
level2Name has 501999 (60.0%) missing values Missing
level3Gid has 695853 (83.2%) missing values Missing
level3Name has 697659 (83.4%) missing values Missing
iucnRedListCategory has 73123 (8.7%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
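
The three alert types above (Constant, Missing, Unique) can be illustrated with a small sketch. This is not ydata-profiling's implementation: the function name `column_alerts`, the use of `None` for a missing cell, and the rules themselves are assumptions chosen to mirror the wording of the alerts, and the real tool may apply thresholds that are not modelled here.

```python
def column_alerts(values):
    """Return alert labels for one column; None marks a missing cell.

    Illustrative rules only, not ydata-profiling's actual logic.
    """
    n = len(values)
    present = [v for v in values if v is not None]
    n_missing = n - len(present)
    n_distinct = len(set(present))

    alerts = []
    # Constant: every non-missing cell holds the same value.
    if present and n_distinct == 1:
        alerts.append("Constant")
    # Missing: at least one cell has no value.
    if n_missing > 0:
        alerts.append(f"Missing ({100 * n_missing / n:.1f}%)")
    # Unique: no missing cells and no repeated value.
    if n > 0 and n_missing == 0 and n_distinct == n:
        alerts.append("Unique")
    return alerts


print(column_alerts(["CC0_1_0"] * 5))
print(column_alerts([None, None, None, "0.0 m"]))
print(column_alerts(["2514633172", "2980371442", "2514602651"]))
```

The second call shows how a column such as sampleSizeValue can be flagged as both Constant and Missing: a single non-missing value is trivially constant.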

Reproduction

Analysis started: 2025-01-08 23:38:45.321100
Analysis finished: 2025-01-08 23:39:25.846555
Duration: 40.53 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
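
The reported duration follows from the two timestamps. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2025-01-08 23:38:45.321100")
finished = datetime.fromisoformat("2025-01-08 23:39:25.846555")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 40.53 seconds
```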

Variables

gbifID
Text

Unique 

Distinct: 836209
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.4 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 8362090
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 836209
Unique (%): 100.0%

Sample

1st row: 2514633172
2nd row: 2980371442
3rd row: 2514602651
4th row: 2980366433
5th row: 2514610075

Value                  Count   Frequency (%)
2514633172             1       < 0.1%
2980380439             1       < 0.1%
2980369451             1       < 0.1%
2514646162             1       < 0.1%
2980370447             1       < 0.1%
2514602651             1       < 0.1%
2980366433             1       < 0.1%
2514610075             1       < 0.1%
2980364432             1       < 0.1%
2516414075             1       < 0.1%
Other values (836199)  836199  > 99.9%

Most occurring characters

ValueCountFrequency (%)
5 1484799
17.8%
2 1326518
15.9%
1 1318720
15.8%
4 696797
8.3%
6 681549
8.2%
3 673136
8.0%
7 644384
7.7%
0 519930
 
6.2%
8 508910
 
6.1%
9 507347
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8362090
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 1484799
17.8%
2 1326518
15.9%
1 1318720
15.8%
4 696797
8.3%
6 681549
8.2%
3 673136
8.0%
7 644384
7.7%
0 519930
 
6.2%
8 508910
 
6.1%
9 507347
 
6.1%

Most occurring scripts

ValueCountFrequency (%)
Common 8362090
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 1484799
17.8%
2 1326518
15.9%
1 1318720
15.8%
4 696797
8.3%
6 681549
8.2%
3 673136
8.0%
7 644384
7.7%
0 519930
 
6.2%
8 508910
 
6.1%
9 507347
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8362090
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 1484799
17.8%
2 1326518
15.9%
1 1318720
15.8%
4 696797
8.3%
6 681549
8.2%
3 673136
8.0%
7 644384
7.7%
0 519930
 
6.2%
8 508910
 
6.1%
9 507347
 
6.1%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.4 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 5853463
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0

Value    Count   Frequency (%)
cc0_1_0  836209  100.0%

Most occurring characters

ValueCountFrequency (%)
C 1672418
28.6%
0 1672418
28.6%
_ 1672418
28.6%
1 836209
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2508627
42.9%
Uppercase Letter 1672418
28.6%
Connector Punctuation 1672418
28.6%
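
The category shares listed above for the license value can be recomputed with Python's standard `unicodedata` module. This is a minimal sketch for the single string "CC0_1_0"; it does not reproduce how the report itself was generated. The two-letter codes returned by `unicodedata.category` are the Unicode general categories: Nd is Decimal Number, Lu is Uppercase Letter, and Pc is Connector Punctuation.

```python
import unicodedata
from collections import Counter

value = "CC0_1_0"

# Count characters by Unicode general category.
counts = Counter(unicodedata.category(ch) for ch in value)

for code, n in counts.most_common():
    print(code, n, f"{100 * n / len(value):.1f}%")
```

Since every row holds the same seven-character value, the per-string shares (3/7, 2/7 and 2/7) equal the column-wide shares of 42.9%, 28.6% and 28.6%.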

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1672418
66.7%
1 836209
33.3%
Uppercase Letter
ValueCountFrequency (%)
C 1672418
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1672418
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4181045
71.4%
Latin 1672418
 
28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1672418
40.0%
_ 1672418
40.0%
1 836209
20.0%
Latin
ValueCountFrequency (%)
C 1672418
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5853463
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 1672418
28.6%
0 1672418
28.6%
_ 1672418
28.6%
1 836209
14.3%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.4 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 24250061
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Naturalis Biodiversity Center
2nd row: Naturalis Biodiversity Center
3rd row: Naturalis Biodiversity Center
4th row: Naturalis Biodiversity Center
5th row: Naturalis Biodiversity Center

Value         Count   Frequency (%)
naturalis     836209  33.3%
biodiversity  836209  33.3%
center        836209  33.3%

Most occurring characters

ValueCountFrequency (%)
i 3344836
13.8%
t 2508627
10.3%
r 2508627
10.3%
e 2508627
10.3%
1672418
 
6.9%
s 1672418
 
6.9%
a 1672418
 
6.9%
d 836209
 
3.4%
C 836209
 
3.4%
y 836209
 
3.4%
Other values (7) 5853463
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20069016
82.8%
Uppercase Letter 2508627
 
10.3%
Space Separator 1672418
 
6.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 3344836
16.7%
t 2508627
12.5%
r 2508627
12.5%
e 2508627
12.5%
s 1672418
8.3%
a 1672418
8.3%
d 836209
 
4.2%
y 836209
 
4.2%
v 836209
 
4.2%
o 836209
 
4.2%
Other values (3) 2508627
12.5%
Uppercase Letter
ValueCountFrequency (%)
C 836209
33.3%
N 836209
33.3%
B 836209
33.3%
Space Separator
ValueCountFrequency (%)
1672418
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22577643
93.1%
Common 1672418
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 3344836
14.8%
t 2508627
11.1%
r 2508627
11.1%
e 2508627
11.1%
s 1672418
 
7.4%
a 1672418
 
7.4%
d 836209
 
3.7%
C 836209
 
3.7%
y 836209
 
3.7%
v 836209
 
3.7%
Other values (6) 5017254
22.2%
Common
ValueCountFrequency (%)
1672418
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24250061
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 3344836
13.8%
t 2508627
10.3%
r 2508627
10.3%
e 2508627
10.3%
1672418
 
6.9%
s 1672418
 
6.9%
a 1672418
 
6.9%
d 836209
 
3.4%
C 836209
 
3.4%
y 836209
 
3.4%
Other values (7) 5853463
24.1%

rightsHolder
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.4 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 24250061
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Naturalis Biodiversity Center
2nd row: Naturalis Biodiversity Center
3rd row: Naturalis Biodiversity Center
4th row: Naturalis Biodiversity Center
5th row: Naturalis Biodiversity Center

Value         Count   Frequency (%)
naturalis     836209  33.3%
biodiversity  836209  33.3%
center        836209  33.3%

Most occurring characters

ValueCountFrequency (%)
i 3344836
13.8%
t 2508627
10.3%
r 2508627
10.3%
e 2508627
10.3%
1672418
 
6.9%
s 1672418
 
6.9%
a 1672418
 
6.9%
d 836209
 
3.4%
C 836209
 
3.4%
y 836209
 
3.4%
Other values (7) 5853463
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20069016
82.8%
Uppercase Letter 2508627
 
10.3%
Space Separator 1672418
 
6.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 3344836
16.7%
t 2508627
12.5%
r 2508627
12.5%
e 2508627
12.5%
s 1672418
8.3%
a 1672418
8.3%
d 836209
 
4.2%
y 836209
 
4.2%
v 836209
 
4.2%
o 836209
 
4.2%
Other values (3) 2508627
12.5%
Uppercase Letter
ValueCountFrequency (%)
C 836209
33.3%
N 836209
33.3%
B 836209
33.3%
Space Separator
ValueCountFrequency (%)
1672418
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22577643
93.1%
Common 1672418
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 3344836
14.8%
t 2508627
11.1%
r 2508627
11.1%
e 2508627
11.1%
s 1672418
 
7.4%
a 1672418
 
7.4%
d 836209
 
3.7%
C 836209
 
3.7%
y 836209
 
3.7%
v 836209
 
3.7%
Other values (6) 5017254
22.2%
Common
ValueCountFrequency (%)
1672418
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24250061
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 3344836
13.8%
t 2508627
10.3%
r 2508627
10.3%
e 2508627
10.3%
1672418
 
6.9%
s 1672418
 
6.9%
a 1672418
 
6.9%
d 836209
 
3.4%
C 836209
 
3.4%
y 836209
 
3.4%
Other values (7) 5853463
24.1%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.4 MiB

Length

Max length: 25
Median length: 25
Mean length: 25
Min length: 25

Characters and Unicode

Total characters: 20905225
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: https://ror.org/0566bfb96
2nd row: https://ror.org/0566bfb96
3rd row: https://ror.org/0566bfb96
4th row: https://ror.org/0566bfb96
5th row: https://ror.org/0566bfb96

Value                      Count   Frequency (%)
https://ror.org/0566bfb96  836209  100.0%

Most occurring characters

ValueCountFrequency (%)
/ 2508627
12.0%
r 2508627
12.0%
6 2508627
12.0%
t 1672418
 
8.0%
o 1672418
 
8.0%
b 1672418
 
8.0%
h 836209
 
4.0%
p 836209
 
4.0%
s 836209
 
4.0%
: 836209
 
4.0%
Other values (6) 5017254
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11706926
56.0%
Decimal Number 5017254
24.0%
Other Punctuation 4181045
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 2508627
21.4%
t 1672418
14.3%
o 1672418
14.3%
b 1672418
14.3%
h 836209
 
7.1%
p 836209
 
7.1%
s 836209
 
7.1%
g 836209
 
7.1%
f 836209
 
7.1%
Decimal Number
ValueCountFrequency (%)
6 2508627
50.0%
0 836209
 
16.7%
5 836209
 
16.7%
9 836209
 
16.7%
Other Punctuation
ValueCountFrequency (%)
/ 2508627
60.0%
: 836209
 
20.0%
. 836209
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11706926
56.0%
Common 9198299
44.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 2508627
21.4%
t 1672418
14.3%
o 1672418
14.3%
b 1672418
14.3%
h 836209
 
7.1%
p 836209
 
7.1%
s 836209
 
7.1%
g 836209
 
7.1%
f 836209
 
7.1%
Common
ValueCountFrequency (%)
/ 2508627
27.3%
6 2508627
27.3%
: 836209
 
9.1%
. 836209
 
9.1%
0 836209
 
9.1%
5 836209
 
9.1%
9 836209
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20905225
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 2508627
12.0%
r 2508627
12.0%
6 2508627
12.0%
t 1672418
 
8.0%
o 1672418
 
8.0%
b 1672418
 
8.0%
h 836209
 
4.0%
p 836209
 
4.0%
s 836209
 
4.0%
: 836209
 
4.0%
Other values (6) 5017254
24.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.4 MiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 5017254
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Botany
2nd row: Botany
3rd row: Botany
4th row: Botany
5th row: Botany

Value   Count   Frequency (%)
botany  836209  100.0%

Most occurring characters

ValueCountFrequency (%)
B 836209
16.7%
o 836209
16.7%
t 836209
16.7%
a 836209
16.7%
n 836209
16.7%
y 836209
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4181045
83.3%
Uppercase Letter 836209
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 836209
20.0%
t 836209
20.0%
a 836209
20.0%
n 836209
20.0%
y 836209
20.0%
Uppercase Letter
ValueCountFrequency (%)
B 836209
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5017254
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 836209
16.7%
o 836209
16.7%
t 836209
16.7%
a 836209
16.7%
n 836209
16.7%
y 836209
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5017254
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 836209
16.7%
o 836209
16.7%
t 836209
16.7%
a 836209
16.7%
n 836209
16.7%
y 836209
16.7%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.4 MiB

Length

Max length: 18
Median length: 18
Mean length: 17.99888066
Min length: 10

Characters and Unicode

Total characters: 15050826
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESERVED_SPECIMEN
2nd row: PRESERVED_SPECIMEN
3rd row: PRESERVED_SPECIMEN
4th row: PRESERVED_SPECIMEN
5th row: PRESERVED_SPECIMEN

Value               Count   Frequency (%)
preserved_specimen  836092  > 99.9%
occurrence          117     < 0.1%
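
The length and character totals reported for this variable follow directly from its two value counts. A minimal sketch, assuming the stored values are the upper-case forms shown in the sample rows (the value table lists them lower-cased):

```python
# Value counts copied from the table above, in their upper-case form.
value_counts = {"PRESERVED_SPECIMEN": 836_092, "OCCURRENCE": 117}

n_rows = sum(value_counts.values())
total_chars = sum(len(v) * c for v, c in value_counts.items())
mean_length = total_chars / n_rows

print(n_rows)                 # 836209
print(total_chars)            # 15050826
print(f"{mean_length:.8f}")   # 17.99888066
```

The 117 shorter values (10 characters instead of 18) are what pull the mean slightly below the median of 18.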

Most occurring characters

ValueCountFrequency (%)
E 4180694
27.8%
R 1672418
11.1%
P 1672184
 
11.1%
S 1672184
 
11.1%
C 836443
 
5.6%
N 836209
 
5.6%
V 836092
 
5.6%
D 836092
 
5.6%
_ 836092
 
5.6%
I 836092
 
5.6%
Other values (3) 836326
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 14214734
94.4%
Connector Punctuation 836092
 
5.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 4180694
29.4%
R 1672418
11.8%
P 1672184
 
11.8%
S 1672184
 
11.8%
C 836443
 
5.9%
N 836209
 
5.9%
V 836092
 
5.9%
D 836092
 
5.9%
I 836092
 
5.9%
M 836092
 
5.9%
Other values (2) 234
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 836092
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14214734
94.4%
Common 836092
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 4180694
29.4%
R 1672418
11.8%
P 1672184
 
11.8%
S 1672184
 
11.8%
C 836443
 
5.9%
N 836209
 
5.9%
V 836092
 
5.9%
D 836092
 
5.9%
I 836092
 
5.9%
M 836092
 
5.9%
Other values (2) 234
 
< 0.1%
Common
ValueCountFrequency (%)
_ 836092
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15050826
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 4180694
27.8%
R 1672418
11.1%
P 1672184
 
11.1%
S 1672184
 
11.1%
C 836443
 
5.6%
N 836209
 
5.6%
V 836092
 
5.6%
D 836092
 
5.6%
_ 836092
 
5.6%
I 836092
 
5.6%
Other values (3) 836326
 
5.6%

occurrenceID
Text

Unique 

Distinct: 836209
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.4 MiB

Length

Max length: 81
Median length: 61
Mean length: 61.65443926
Min length: 48

Characters and Unicode

Total characters: 51555997
Distinct characters: 55
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 836209
Unique (%): 100.0%

Sample

1st row: https://data.biodiversitydata.nl/naturalis/specimen/L.2851604
2nd row: https://data.biodiversitydata.nl/naturalis/specimen/L%20%200971472
3rd row: https://data.biodiversitydata.nl/naturalis/specimen/L.2851644
4th row: https://data.biodiversitydata.nl/naturalis/specimen/L%20%200971531
5th row: https://data.biodiversitydata.nl/naturalis/specimen/L.2851686

Value                                                                Count   Frequency (%)
https://data.biodiversitydata.nl/naturalis/specimen/wag0100360       2       < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l.2851604        1       < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l.2852416        1       < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l%20%200972015   1       < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l.2852067        1       < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l%20%200971964   1       < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l.2851644        1       < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l%20%200971531   1       < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l.2851686        1       < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/l%20%200971533   1       < 0.1%
Other values (836198)                                                836198  > 99.9%
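
Several of the occurrenceID values above contain `%20`, the percent-encoded form of a space. The sketch below decodes one of them with Python's standard `urllib.parse` module, assuming the part after the final slash is a percent-encoded specimen identifier; the names `BASE` and `identifier` are illustrative.

```python
from urllib.parse import quote, unquote

BASE = "https://data.biodiversitydata.nl/naturalis/specimen/"
url = BASE + "L%20%200971472"  # 2nd sample row of occurrenceID

# Strip the common prefix and decode the percent-escapes.
identifier = unquote(url[len(BASE):])
print(repr(identifier))  # 'L  0971472' (two spaces)

# Re-encoding the identifier gives back the original URL.
assert BASE + quote(identifier) == url
```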

Most occurring characters

ValueCountFrequency (%)
a 5017261
 
9.7%
t 5017255
 
9.7%
i 4181045
 
8.1%
/ 4181044
 
8.1%
s 3344836
 
6.5%
n 2508627
 
4.9%
e 2508627
 
4.9%
d 2508627
 
4.9%
. 2447805
 
4.7%
l 1672421
 
3.2%
Other values (45) 18168449
35.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 36793237
71.4%
Other Punctuation 7572035
 
14.7%
Decimal Number 6038903
 
11.7%
Uppercase Letter 1151820
 
2.2%
Connector Punctuation 1
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5017261
13.6%
t 5017255
13.6%
i 4181045
11.4%
s 3344836
9.1%
n 2508627
 
6.8%
e 2508627
 
6.8%
d 2508627
 
6.8%
l 1672421
 
4.5%
p 1672418
 
4.5%
r 1672418
 
4.5%
Other values (10) 6689702
18.2%
Uppercase Letter
ValueCountFrequency (%)
L 582349
50.6%
A 157806
 
13.7%
G 140829
 
12.2%
W 140821
 
12.2%
U 96045
 
8.3%
M 16975
 
1.5%
D 16972
 
1.5%
P 4
 
< 0.1%
N 3
 
< 0.1%
F 3
 
< 0.1%
Other values (8) 13
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 902344
14.9%
2 795873
13.2%
0 680107
11.3%
3 666803
11.0%
4 571191
9.5%
7 498967
8.3%
5 497652
8.2%
6 480049
7.9%
9 474458
7.9%
8 471459
7.8%
Other Punctuation
ValueCountFrequency (%)
/ 4181044
55.2%
. 2447805
32.3%
: 836209
 
11.0%
% 106968
 
1.4%
! 9
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37945057
73.6%
Common 13610940
 
26.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5017261
13.2%
t 5017255
13.2%
i 4181045
11.0%
s 3344836
8.8%
n 2508627
 
6.6%
e 2508627
 
6.6%
d 2508627
 
6.6%
l 1672421
 
4.4%
p 1672418
 
4.4%
r 1672418
 
4.4%
Other values (28) 7841522
20.7%
Common
ValueCountFrequency (%)
/ 4181044
30.7%
. 2447805
18.0%
1 902344
 
6.6%
: 836209
 
6.1%
2 795873
 
5.8%
0 680107
 
5.0%
3 666803
 
4.9%
4 571191
 
4.2%
7 498967
 
3.7%
5 497652
 
3.7%
Other values (7) 1532945
 
11.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 51555997
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5017261
 
9.7%
t 5017255
 
9.7%
i 4181045
 
8.1%
/ 4181044
 
8.1%
s 3344836
 
6.5%
n 2508627
 
4.9%
e 2508627
 
4.9%
d 2508627
 
4.9%
. 2447805
 
4.7%
l 1672421
 
3.2%
Other values (45) 18168449
35.2%
Distinct: 836208
Distinct (%): 100.0%
Missing: 1
Missing (%): < 0.1%
Memory size: 6.4 MiB

Length

Max length: 15
Median length: 9
Mean length: 9.398614938
Min length: 8

Characters and Unicode

Total characters: 7859197
Distinct characters: 45
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 836208
Unique (%): 100.0%

Sample

1st row: L.2851604
2nd row: L 0971472
3rd row: L.2851644
4th row: L 0971531
5th row: L.2851686

Value                  Count   Frequency (%)
l                      42432   4.8%
u                      11055   1.2%
0001135                2       < 0.1%
0001034                2       < 0.1%
0000756                2       < 0.1%
0000796                2       < 0.1%
0000857                2       < 0.1%
0000899                2       < 0.1%
0000981                2       < 0.1%
0001074                2       < 0.1%
Other values (835366)  836198  94.0%

Most occurring characters

ValueCountFrequency (%)
1 902344
11.5%
. 775387
9.9%
2 688910
8.8%
3 666802
8.5%
L 582349
 
7.4%
0 573140
 
7.3%
4 571191
 
7.3%
7 498967
 
6.3%
5 497652
 
6.3%
6 480045
 
6.1%
Other values (35) 1622410
20.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5824968
74.1%
Uppercase Letter 1151819
 
14.7%
Other Punctuation 775397
 
9.9%
Space Separator 106963
 
1.4%
Lowercase Letter 44
 
< 0.1%
Modifier Symbol 4
 
< 0.1%
Dash Punctuation 1
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 582349
50.6%
A 157806
 
13.7%
G 140829
 
12.2%
W 140821
 
12.2%
U 96045
 
8.3%
M 16975
 
1.5%
D 16972
 
1.5%
P 4
 
< 0.1%
I 3
 
< 0.1%
N 3
 
< 0.1%
Other values (8) 12
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 902344
15.5%
2 688910
11.8%
3 666802
11.4%
0 573140
9.8%
4 571191
9.8%
7 498967
8.6%
5 497652
8.5%
6 480045
8.2%
9 474458
8.1%
8 471459
8.1%
Lowercase Letter
ValueCountFrequency (%)
w 17
38.6%
g 10
22.7%
a 7
15.9%
l 3
 
6.8%
o 2
 
4.5%
t 1
 
2.3%
u 1
 
2.3%
v 1
 
2.3%
n 1
 
2.3%
e 1
 
2.3%
Other Punctuation
ValueCountFrequency (%)
. 775387
> 99.9%
! 9
 
< 0.1%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
106963
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6707334
85.3%
Latin 1151863
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 582349
50.6%
A 157806
 
13.7%
G 140829
 
12.2%
W 140821
 
12.2%
U 96045
 
8.3%
M 16975
 
1.5%
D 16972
 
1.5%
w 17
 
< 0.1%
g 10
 
< 0.1%
a 7
 
< 0.1%
Other values (18) 32
 
< 0.1%
Common
ValueCountFrequency (%)
1 902344
13.5%
. 775387
11.6%
2 688910
10.3%
3 666802
9.9%
0 573140
8.5%
4 571191
8.5%
7 498967
7.4%
5 497652
7.4%
6 480045
7.2%
9 474458
7.1%
Other values (7) 578438
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7859197
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 902344
11.5%
. 775387
9.9%
2 688910
8.8%
3 666802
8.5%
L 582349
 
7.4%
0 573140
 
7.3%
4 571191
 
7.3%
7 498967
 
6.3%
5 497652
 
6.3%
6 480045
 
6.1%
Other values (35) 1622410
20.6%
Distinct: 572294
Distinct (%): 68.4%
Missing: 2
Missing (%): < 0.1%
Memory size: 6.4 MiB

Length

Max length: 110
Median length: 103
Mean length: 21.18251701
Min length: 2

Characters and Unicode

Total characters: 17712969
Distinct characters: 129
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 542010
Unique (%): 64.8%

Sample

1st row: Unknown s.n.
2nd row: Zainoeddin bb 17357
3rd row: Wijk, JH van s.n.
4th row: Unknown bb 17412
5th row: Koster, JT 6255

Value                 Count    Frequency (%)
s.n                   256450   7.7%
van                   68460    2.1%
unknown               67287    2.0%
de                    50387    1.5%
a                     44401    1.3%
j                     43306    1.3%
m                     26511    0.8%
h                     23612    0.7%
p                     23387    0.7%
r                     23030    0.7%
Other values (92734)  2694093  81.1%

Most occurring characters

ValueCountFrequency (%)
2484713
 
14.0%
n 1099909
 
6.2%
e 1017309
 
5.7%
, 927788
 
5.2%
a 730231
 
4.1%
s 643605
 
3.6%
r 584399
 
3.3%
o 579526
 
3.3%
. 547214
 
3.1%
i 479904
 
2.7%
Other values (119) 8618371
48.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7937101
44.8%
Uppercase Letter 3138266
 
17.7%
Space Separator 2484717
 
14.0%
Decimal Number 2321671
 
13.1%
Other Punctuation 1748190
 
9.9%
Dash Punctuation 63539
 
0.4%
Open Punctuation 9522
 
0.1%
Close Punctuation 9517
 
0.1%
Connector Punctuation 352
 
< 0.1%
Math Symbol 52
 
< 0.1%
Other values (3) 42
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1099909
13.9%
e 1017309
12.8%
a 730231
9.2%
s 643605
 
8.1%
r 584399
 
7.4%
o 579526
 
7.3%
i 479904
 
6.0%
l 369214
 
4.7%
t 342007
 
4.3%
d 276655
 
3.5%
Other values (44) 1814342
22.9%
Uppercase Letter
ValueCountFrequency (%)
J 322743
 
10.3%
H 219232
 
7.0%
A 214963
 
6.8%
S 214864
 
6.8%
B 205943
 
6.6%
M 195533
 
6.2%
C 166837
 
5.3%
W 157895
 
5.0%
P 154576
 
4.9%
R 139022
 
4.4%
Other values (27) 1146658
36.5%
Other Punctuation
ValueCountFrequency (%)
, 927788
53.1%
. 547214
31.3%
; 258390
 
14.8%
/ 8084
 
0.5%
' 5032
 
0.3%
! 1088
 
0.1%
: 366
 
< 0.1%
? 114
 
< 0.1%
\ 42
 
< 0.1%
* 34
 
< 0.1%
Other values (3) 38
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 364437
15.7%
2 275734
11.9%
3 245370
10.6%
4 225943
9.7%
5 216525
9.3%
6 208526
9.0%
7 201503
8.7%
0 195678
8.4%
8 194815
8.4%
9 193140
8.3%
Open Punctuation
ValueCountFrequency (%)
( 9242
97.1%
[ 279
 
2.9%
{ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2484713
> 99.9%
  4
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 9239
97.1%
] 278
 
2.9%
Math Symbol
ValueCountFrequency (%)
+ 46
88.5%
= 6
 
11.5%
Other Number
ValueCountFrequency (%)
½ 15
88.2%
¼ 2
 
11.8%
Dash Punctuation
ValueCountFrequency (%)
- 63539
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 352
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 22
100.0%
Other Letter
ValueCountFrequency (%)
ª 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11075370
62.5%
Common 6637599
37.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 1099909
 
9.9%
e 1017309
 
9.2%
a 730231
 
6.6%
s 643605
 
5.8%
r 584399
 
5.3%
o 579526
 
5.2%
i 479904
 
4.3%
l 369214
 
3.3%
t 342007
 
3.1%
J 322743
 
2.9%
Other values (82) 4906523
44.3%
Common
ValueCountFrequency (%)
2484713
37.4%
, 927788
 
14.0%
. 547214
 
8.2%
1 364437
 
5.5%
2 275734
 
4.2%
; 258390
 
3.9%
3 245370
 
3.7%
4 225943
 
3.4%
5 216525
 
3.3%
6 208526
 
3.1%
Other values (27) 882959
 
13.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17673878
99.8%
None 39091
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2484713
 
14.1%
n 1099909
 
6.2%
e 1017309
 
5.8%
, 927788
 
5.2%
a 730231
 
4.1%
s 643605
 
3.6%
r 584399
 
3.3%
o 579526
 
3.3%
. 547214
 
3.1%
i 479904
 
2.7%
Other values (75) 8579280
48.5%
None
ValueCountFrequency (%)
é 10707
27.4%
ü 7216
18.5%
ö 3633
 
9.3%
á 3193
 
8.2%
è 2754
 
7.0%
í 1869
 
4.8%
ñ 1745
 
4.5%
ß 1460
 
3.7%
ó 1358
 
3.5%
ë 879
 
2.2%
Other values (34) 4277
 
10.9%
Distinct 48505
Distinct (%) 5.8%
Missing 1735
Missing (%) 0.2%
Memory size 6.4 MiB

Length

Max length 98
Median length 94
Mean length 14.5336799
Min length 1

Characters and Unicode

Total characters 12127978
Distinct characters 116
Distinct categories 10
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
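
As a rough illustration of the kind of character-property tally reported in these sections, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not taken from the profiling tool itself: the helper name `category_counts` is hypothetical, and the standard library returns two-letter codes (`Lu`, `Ll`, `Zs`) rather than the long names ("Uppercase Letter", "Lowercase Letter", "Space Separator") used in the report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Tally Unicode general categories over every character in an iterable of strings.

    Hypothetical helper for illustration; None entries (missing cells) are skipped.
    """
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# The five sample rows shown for this variable.
sample = ["Unknown", "Zainoeddin", "Wijk JH van", "Unknown", "Koster JT"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

On a full column the same loop would be run over all non-missing values; dividing each count by the total number of characters gives the percentage column.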

Unique

Unique 21906
Unique (%) 2.6%

Sample

1st row Unknown
2nd row Zainoeddin
3rd row Wijk JH van
4th row Unknown
5th row Koster JT
ValueCountFrequency (%)
van 68460
 
2.9%
unknown 67287
 
2.9%
de 50384
 
2.2%
j 43131
 
1.8%
a 35050
 
1.5%
m 25600
 
1.1%
h 23002
 
1.0%
r 22773
 
1.0%
al 22732
 
1.0%
p 22508
 
1.0%
Other values (25712) 1960338
83.7%

Most occurring characters

ValueCountFrequency (%)
1506829
 
12.4%
e 1015405
 
8.4%
n 838456
 
6.9%
a 723122
 
6.0%
r 581380
 
4.8%
o 577768
 
4.8%
i 477363
 
3.9%
s 386244
 
3.2%
l 367602
 
3.0%
t 341134
 
2.8%
Other values (106) 5312675
43.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7372185
60.8%
Uppercase Letter 2900716
 
23.9%
Space Separator 1506833
 
12.4%
Other Punctuation 289450
 
2.4%
Dash Punctuation 42507
 
0.4%
Decimal Number 7800
 
0.1%
Open Punctuation 4069
 
< 0.1%
Close Punctuation 4067
 
< 0.1%
Connector Punctuation 331
 
< 0.1%
Math Symbol 20
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1015405
13.8%
n 838456
11.4%
a 723122
9.8%
r 581380
 
7.9%
o 577768
 
7.8%
i 477363
 
6.5%
s 386244
 
5.2%
l 367602
 
5.0%
t 341134
 
4.6%
d 272450
 
3.7%
Other values (43) 1791261
24.3%
Uppercase Letter
ValueCountFrequency (%)
J 322054
 
11.1%
H 212297
 
7.3%
A 192463
 
6.6%
M 192264
 
6.6%
S 185383
 
6.4%
B 184355
 
6.4%
C 162164
 
5.6%
W 152179
 
5.2%
R 130864
 
4.5%
P 129745
 
4.5%
Other values (27) 1036948
35.7%
Decimal Number
ValueCountFrequency (%)
1 1670
21.4%
9 1504
19.3%
6 974
12.5%
7 883
11.3%
4 725
9.3%
8 645
 
8.3%
0 378
 
4.8%
5 348
 
4.5%
2 339
 
4.3%
3 334
 
4.3%
Other Punctuation
ValueCountFrequency (%)
; 258389
89.3%
. 25959
 
9.0%
' 4957
 
1.7%
? 70
 
< 0.1%
/ 45
 
< 0.1%
& 20
 
< 0.1%
! 6
 
< 0.1%
¡ 4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1506829
> 99.9%
  4
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 16
80.0%
= 4
 
20.0%
Dash Punctuation
ValueCountFrequency (%)
- 42507
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4069
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4067
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 331
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10272901
84.7%
Common 1855077
 
15.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1015405
 
9.9%
n 838456
 
8.2%
a 723122
 
7.0%
r 581380
 
5.7%
o 577768
 
5.6%
i 477363
 
4.6%
s 386244
 
3.8%
l 367602
 
3.6%
t 341134
 
3.3%
J 322054
 
3.1%
Other values (80) 4642373
45.2%
Common
ValueCountFrequency (%)
1506829
81.2%
; 258389
 
13.9%
- 42507
 
2.3%
. 25959
 
1.4%
' 4957
 
0.3%
( 4069
 
0.2%
) 4067
 
0.2%
1 1670
 
0.1%
9 1504
 
0.1%
6 974
 
0.1%
Other values (16) 4152
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12090377
99.7%
None 37601
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1506829
 
12.5%
e 1015405
 
8.4%
n 838456
 
6.9%
a 723122
 
6.0%
r 581380
 
4.8%
o 577768
 
4.8%
i 477363
 
3.9%
s 386244
 
3.2%
l 367602
 
3.0%
t 341134
 
2.8%
Other values (66) 5275074
43.6%
None
ValueCountFrequency (%)
é 10707
28.5%
ü 7216
19.2%
ö 3633
 
9.7%
á 3193
 
8.5%
è 2754
 
7.3%
í 1869
 
5.0%
ñ 1745
 
4.6%
ó 1358
 
3.6%
ë 879
 
2.3%
ä 815
 
2.2%
Other values (30) 3432
 
9.1%

occurrenceStatus
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 1
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 5853456
Distinct characters 6
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row PRESENT
2nd row PRESENT
3rd row PRESENT
4th row PRESENT
5th row PRESENT
ValueCountFrequency (%)
present 836208
100.0%

Most occurring characters

ValueCountFrequency (%)
E 1672416
28.6%
P 836208
14.3%
R 836208
14.3%
S 836208
14.3%
N 836208
14.3%
T 836208
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5853456
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1672416
28.6%
P 836208
14.3%
R 836208
14.3%
S 836208
14.3%
N 836208
14.3%
T 836208
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 5853456
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1672416
28.6%
P 836208
14.3%
R 836208
14.3%
S 836208
14.3%
N 836208
14.3%
T 836208
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5853456
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1672416
28.6%
P 836208
14.3%
R 836208
14.3%
S 836208
14.3%
N 836208
14.3%
T 836208
14.3%

otherCatalogNumbers
Text

Missing 

Distinct 208546
Distinct (%) 99.5%
Missing 626711
Missing (%) 74.9%
Memory size 6.4 MiB

Length

Max length 26
Median length 10
Mean length 10.80295755
Min length 1

Characters and Unicode

Total characters 2263198
Distinct characters 71
Distinct categories 8
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 207683
Unique (%) 99.1%

Sample

1st row L 0215467
2nd row L 0215532
3rd row L 0204325
4th row L 0542724
5th row L 0973113
ValueCountFrequency (%)
l 106173
31.2%
u 21496
 
6.3%
b 684
 
0.2%
uw 595
 
0.2%
a 425
 
0.1%
fhow 27
 
< 0.1%
unw 22
 
< 0.1%
madw 20
 
< 0.1%
bw 19
 
< 0.1%
0 18
 
< 0.1%
Other values (208545) 210806
61.9%

Most occurring characters

ValueCountFrequency (%)
0 343786
15.2%
258456
11.4%
1 177209
 
7.8%
2 165005
 
7.3%
3 153082
 
6.8%
9 149145
 
6.6%
4 140099
 
6.2%
5 138989
 
6.1%
8 135727
 
6.0%
6 131580
 
5.8%
Other values (61) 470120
20.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1656769
73.2%
Uppercase Letter 300499
 
13.3%
Space Separator 258456
 
11.4%
Other Punctuation 31261
 
1.4%
Lowercase Letter 16106
 
0.7%
Dash Punctuation 100
 
< 0.1%
Modifier Symbol 6
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 106259
35.4%
A 52483
17.5%
W 50540
16.8%
G 50437
16.8%
U 29934
 
10.0%
D 2506
 
0.8%
M 1441
 
0.5%
F 1003
 
0.3%
O 898
 
0.3%
B 884
 
0.3%
Other values (16) 4114
 
1.4%
Lowercase Letter
ValueCountFrequency (%)
w 11100
68.9%
u 530
 
3.3%
e 496
 
3.1%
i 445
 
2.8%
a 402
 
2.5%
n 376
 
2.3%
j 348
 
2.2%
p 291
 
1.8%
t 281
 
1.7%
l 263
 
1.6%
Other values (14) 1574
 
9.8%
Decimal Number
ValueCountFrequency (%)
0 343786
20.8%
1 177209
10.7%
2 165005
10.0%
3 153082
9.2%
9 149145
9.0%
4 140099
8.5%
5 138989
8.4%
8 135727
 
8.2%
6 131580
 
7.9%
7 122147
 
7.4%
Other Punctuation
ValueCountFrequency (%)
; 30046
96.1%
. 1078
 
3.4%
: 105
 
0.3%
/ 29
 
0.1%
? 1
 
< 0.1%
' 1
 
< 0.1%
! 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
258456
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 100
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 6
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1946593
86.0%
Latin 316605
 
14.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 106259
33.6%
A 52483
16.6%
W 50540
16.0%
G 50437
15.9%
U 29934
 
9.5%
w 11100
 
3.5%
D 2506
 
0.8%
M 1441
 
0.5%
F 1003
 
0.3%
O 898
 
0.3%
Other values (40) 10004
 
3.2%
Common
ValueCountFrequency (%)
0 343786
17.7%
258456
13.3%
1 177209
9.1%
2 165005
8.5%
3 153082
7.9%
9 149145
7.7%
4 140099
7.2%
5 138989
7.1%
8 135727
 
7.0%
6 131580
 
6.8%
Other values (11) 153515
7.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2263198
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 343786
15.2%
258456
11.4%
1 177209
 
7.8%
2 165005
 
7.3%
3 153082
 
6.8%
9 149145
 
6.6%
4 140099
 
6.2%
5 138989
 
6.1%
8 135727
 
6.0%
6 131580
 
5.8%
Other values (61) 470120
20.8%

eventDate
Text

Missing 

Distinct 55581
Distinct (%) 8.0%
Missing 143292
Missing (%) 17.1%
Memory size 6.4 MiB

Length

Max length 21
Median length 10
Mean length 11.82515511
Min length 10

Characters and Unicode

Total characters 8193851
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 8468
Unique (%) 1.2%

Sample

1st row 1933-04-24
2nd row 1956-05-14
3rd row 1939-05-21
4th row 1955-04-26
5th row 1838-05-01/1838-05-31
ValueCountFrequency (%)
1859-01-01/1859-12-31 870
 
0.1%
1857-01-01/1857-12-31 606
 
0.1%
1898-01-01/1898-12-31 535
 
0.1%
1922-10-01/1922-10-31 490
 
0.1%
1912-01-01/1912-12-31 463
 
0.1%
1900-01-01/1900-12-31 443
 
0.1%
1840-01-01/1840-12-31 438
 
0.1%
1909-01-01/1909-12-31 438
 
0.1%
1893-01-01/1893-12-31 434
 
0.1%
1880-01-01/1880-12-31 425
 
0.1%
Other values (55571) 687775
99.3%

Most occurring characters

ValueCountFrequency (%)
1 1643592
20.1%
- 1615776
19.7%
0 1247462
15.2%
9 943290
11.5%
2 517481
 
6.3%
8 437504
 
5.3%
3 388459
 
4.7%
6 361794
 
4.4%
7 360295
 
4.4%
5 319026
 
3.9%
Other values (2) 359172
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6463104
78.9%
Dash Punctuation 1615776
 
19.7%
Other Punctuation 114971
 
1.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1643592
25.4%
0 1247462
19.3%
9 943290
14.6%
2 517481
 
8.0%
8 437504
 
6.8%
3 388459
 
6.0%
6 361794
 
5.6%
7 360295
 
5.6%
5 319026
 
4.9%
4 244201
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 1615776
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 114971
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8193851
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1643592
20.1%
- 1615776
19.7%
0 1247462
15.2%
9 943290
11.5%
2 517481
 
6.3%
8 437504
 
5.3%
3 388459
 
4.7%
6 361794
 
4.4%
7 360295
 
4.4%
5 319026
 
3.9%
Other values (2) 359172
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8193851
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1643592
20.1%
- 1615776
19.7%
0 1247462
15.2%
9 943290
11.5%
2 517481
 
6.3%
8 437504
 
5.3%
3 388459
 
4.7%
6 361794
 
4.4%
7 360295
 
4.4%
5 319026
 
3.9%
Other values (2) 359172
 
4.4%

startDayOfYear
Text

Missing 

Distinct 366
Distinct (%) 0.1%
Missing 143292
Missing (%) 17.1%
Memory size 6.4 MiB

Length

Max length 3
Median length 3
Mean length 2.711362256
Min length 1

Characters and Unicode

Total characters 1878749
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 114
2nd row 135
3rd row 141
4th row 116
5th row 121
ValueCountFrequency (%)
1 36360
 
5.2%
182 13626
 
2.0%
213 11511
 
1.7%
152 10863
 
1.6%
121 8743
 
1.3%
244 6422
 
0.9%
183 5707
 
0.8%
274 5575
 
0.8%
91 5567
 
0.8%
214 5131
 
0.7%
Other values (356) 583412
84.2%

Most occurring characters

ValueCountFrequency (%)
1 425475
22.6%
2 367264
19.5%
3 221411
11.8%
4 136428
 
7.3%
5 135758
 
7.2%
8 123402
 
6.6%
0 120308
 
6.4%
6 118857
 
6.3%
9 117427
 
6.3%
7 112419
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1878749
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 425475
22.6%
2 367264
19.5%
3 221411
11.8%
4 136428
 
7.3%
5 135758
 
7.2%
8 123402
 
6.6%
0 120308
 
6.4%
6 118857
 
6.3%
9 117427
 
6.3%
7 112419
 
6.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1878749
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 425475
22.6%
2 367264
19.5%
3 221411
11.8%
4 136428
 
7.3%
5 135758
 
7.2%
8 123402
 
6.6%
0 120308
 
6.4%
6 118857
 
6.3%
9 117427
 
6.3%
7 112419
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1878749
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 425475
22.6%
2 367264
19.5%
3 221411
11.8%
4 136428
 
7.3%
5 135758
 
7.2%
8 123402
 
6.6%
0 120308
 
6.4%
6 118857
 
6.3%
9 117427
 
6.3%
7 112419
 
6.0%

endDayOfYear
Text

Missing 

Distinct 366
Distinct (%) 0.1%
Missing 143292
Missing (%) 17.1%
Memory size 6.4 MiB

Length

Max length 3
Median length 3
Mean length 2.819598884
Min length 1

Characters and Unicode

Total characters 1953748
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 114
2nd row 135
3rd row 141
4th row 116
5th row 151
ValueCountFrequency (%)
365 27667
 
4.0%
212 13371
 
1.9%
243 10946
 
1.6%
181 10761
 
1.6%
151 9144
 
1.3%
366 8805
 
1.3%
273 6217
 
0.9%
120 6124
 
0.9%
213 5878
 
0.8%
304 5369
 
0.8%
Other values (356) 588635
85.0%

Most occurring characters

ValueCountFrequency (%)
1 391770
20.1%
2 363048
18.6%
3 263424
13.5%
6 158539
8.1%
5 156753
8.0%
4 142445
 
7.3%
0 126544
 
6.5%
8 120011
 
6.1%
9 117735
 
6.0%
7 113479
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1953748
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 391770
20.1%
2 363048
18.6%
3 263424
13.5%
6 158539
8.1%
5 156753
8.0%
4 142445
 
7.3%
0 126544
 
6.5%
8 120011
 
6.1%
9 117735
 
6.0%
7 113479
 
5.8%

Most occurring scripts

ValueCountFrequency (%)
Common 1953748
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 391770
20.1%
2 363048
18.6%
3 263424
13.5%
6 158539
8.1%
5 156753
8.0%
4 142445
 
7.3%
0 126544
 
6.5%
8 120011
 
6.1%
9 117735
 
6.0%
7 113479
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1953748
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 391770
20.1%
2 363048
18.6%
3 263424
13.5%
6 158539
8.1%
5 156753
8.0%
4 142445
 
7.3%
0 126544
 
6.5%
8 120011
 
6.1%
9 117735
 
6.0%
7 113479
 
5.8%
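
The sample rows for eventDate, startDayOfYear and endDayOfYear appear to line up (for example, 1933-04-24 alongside 114, and 1838-05-01/1838-05-31 alongside 121 and 151). The sketch below shows one way those day-of-year values could be derived from an eventDate string, assuming it is either a single ISO date or a `start/end` interval as in the samples. The function name `day_of_year_range` is hypothetical, and this is not necessarily how the published values were produced.

```python
from datetime import date


def day_of_year_range(event_date):
    """Return (start day of year, end day of year) for an eventDate string.

    Accepts 'YYYY-MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD'; a single date is
    treated as an interval that starts and ends on the same day.
    """
    start_text, _, end_text = event_date.partition("/")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text) if end_text else start
    return start.timetuple().tm_yday, end.timetuple().tm_yday


for text in ["1933-04-24", "1956-05-14", "1838-05-01/1838-05-31"]:
    print(text, day_of_year_range(text))
```

A whole-year interval such as 1859-01-01/1859-12-31 yields 1 and 365 (366 in a leap year), which is consistent with 1, 365 and 366 being among the most frequent values in the two day-of-year columns.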

year
Text

Missing 

Distinct 286
Distinct (%) < 0.1%
Missing 143292
Missing (%) 17.1%
Memory size 6.4 MiB

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 2771668
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 24
Unique (%) < 0.1%

Sample

1st row 1933
2nd row 1956
3rd row 1939
4th row 1955
5th row 1838
ValueCountFrequency (%)
1969 12887
 
1.9%
1968 12780
 
1.8%
1966 12388
 
1.8%
1965 11988
 
1.7%
1967 11903
 
1.7%
1974 11118
 
1.6%
1964 11038
 
1.6%
1961 11015
 
1.6%
1972 10962
 
1.6%
1963 10940
 
1.6%
Other values (276) 575898
83.1%

Most occurring characters

ValueCountFrequency (%)
1 762334
27.5%
9 729495
26.3%
8 225933
 
8.2%
6 191185
 
6.9%
7 173174
 
6.2%
0 161288
 
5.8%
5 156166
 
5.6%
2 140469
 
5.1%
3 123288
 
4.4%
4 108336
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2771668
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 762334
27.5%
9 729495
26.3%
8 225933
 
8.2%
6 191185
 
6.9%
7 173174
 
6.2%
0 161288
 
5.8%
5 156166
 
5.6%
2 140469
 
5.1%
3 123288
 
4.4%
4 108336
 
3.9%

Most occurring scripts

ValueCountFrequency (%)
Common 2771668
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 762334
27.5%
9 729495
26.3%
8 225933
 
8.2%
6 191185
 
6.9%
7 173174
 
6.2%
0 161288
 
5.8%
5 156166
 
5.6%
2 140469
 
5.1%
3 123288
 
4.4%
4 108336
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2771668
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 762334
27.5%
9 729495
26.3%
8 225933
 
8.2%
6 191185
 
6.9%
7 173174
 
6.2%
0 161288
 
5.8%
5 156166
 
5.6%
2 140469
 
5.1%
3 123288
 
4.4%
4 108336
 
3.9%

month
Text

Missing 

Distinct 12
Distinct (%) < 0.1%
Missing 176649
Missing (%) 21.1%
Memory size 6.4 MiB

Length

Max length 2
Median length 1
Mean length 1.186186245
Min length 1

Characters and Unicode

Total characters 782361
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 4
2nd row 5
3rd row 5
4th row 4
5th row 5
ValueCountFrequency (%)
7 93480
14.2%
6 79346
12.0%
8 76084
11.5%
5 71612
10.9%
9 56523
8.6%
4 53174
8.1%
10 49536
7.5%
11 42390
6.4%
3 42087
6.4%
2 33109
 
5.0%
Other values (2) 62219
9.4%

Most occurring characters

ValueCountFrequency (%)
1 196535
25.1%
7 93480
11.9%
6 79346
10.1%
8 76084
 
9.7%
5 71612
 
9.2%
2 63984
 
8.2%
9 56523
 
7.2%
4 53174
 
6.8%
0 49536
 
6.3%
3 42087
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 782361
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 196535
25.1%
7 93480
11.9%
6 79346
10.1%
8 76084
 
9.7%
5 71612
 
9.2%
2 63984
 
8.2%
9 56523
 
7.2%
4 53174
 
6.8%
0 49536
 
6.3%
3 42087
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Common 782361
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 196535
25.1%
7 93480
11.9%
6 79346
10.1%
8 76084
 
9.7%
5 71612
 
9.2%
2 63984
 
8.2%
9 56523
 
7.2%
4 53174
 
6.8%
0 49536
 
6.3%
3 42087
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 782361
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 196535
25.1%
7 93480
11.9%
6 79346
10.1%
8 76084
 
9.7%
5 71612
 
9.2%
2 63984
 
8.2%
9 56523
 
7.2%
4 53174
 
6.8%
0 49536
 
6.3%
3 42087
 
5.4%

day
Text

Missing 

Distinct 31
Distinct (%) < 0.1%
Missing 258263
Missing (%) 30.9%
Memory size 6.4 MiB

Length

Max length 2
Median length 2
Mean length 1.716617123
Min length 1

Characters and Unicode

Total characters 992112
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 24
2nd row 14
3rd row 21
4th row 26
5th row 10
ValueCountFrequency (%)
20 21100
 
3.7%
10 20702
 
3.6%
15 20563
 
3.6%
12 20011
 
3.5%
18 19984
 
3.5%
25 19720
 
3.4%
22 19601
 
3.4%
23 19480
 
3.4%
17 19411
 
3.4%
14 19308
 
3.3%
Other values (21) 378066
65.4%

Most occurring characters

ValueCountFrequency (%)
1 261762
26.4%
2 248712
25.1%
3 82722
 
8.3%
5 58743
 
5.9%
0 58587
 
5.9%
8 57340
 
5.8%
7 56705
 
5.7%
6 56290
 
5.7%
4 56252
 
5.7%
9 54999
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 992112
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 261762
26.4%
2 248712
25.1%
3 82722
 
8.3%
5 58743
 
5.9%
0 58587
 
5.9%
8 57340
 
5.8%
7 56705
 
5.7%
6 56290
 
5.7%
4 56252
 
5.7%
9 54999
 
5.5%

Most occurring scripts

ValueCountFrequency (%)
Common 992112
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 261762
26.4%
2 248712
25.1%
3 82722
 
8.3%
5 58743
 
5.9%
0 58587
 
5.9%
8 57340
 
5.8%
7 56705
 
5.7%
6 56290
 
5.7%
4 56252
 
5.7%
9 54999
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 992112
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 261762
26.4%
2 248712
25.1%
3 82722
 
8.3%
5 58743
 
5.9%
0 58587
 
5.9%
8 57340
 
5.8%
7 56705
 
5.7%
6 56290
 
5.7%
4 56252
 
5.7%
9 54999
 
5.5%

habitat
Text

Missing 

Distinct 85802
Distinct (%) 57.4%
Missing 686738
Missing (%) 82.1%
Memory size 6.4 MiB

Length

Max length 38739
Median length 444
Mean length 40.11896622
Min length 1

Characters and Unicode

Total characters 5996622
Distinct characters 153
Distinct categories 18
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 70802
Unique (%) 47.4%

Sample

1st row Old forest
2nd row Old forest Very scanty
3rd row Old forest, steep ridge
4th row Old forest, clayey soil, sloping country, scanty
5th row Degrade forest
ValueCountFrequency (%)
forest 69458
 
7.8%
in 32271
 
3.6%
on 27752
 
3.1%
of 15232
 
1.7%
soil 14658
 
1.6%
primary 13760
 
1.5%
with 12081
 
1.4%
secondary 11807
 
1.3%
the 11193
 
1.3%
along 10578
 
1.2%
Other values (37367) 672044
75.4%

Most occurring characters

ValueCountFrequency (%)
740532
 
12.3%
e 583099
 
9.7%
r 430338
 
7.2%
a 415456
 
6.9%
o 409182
 
6.8%
n 340698
 
5.7%
s 329561
 
5.5%
t 306586
 
5.1%
i 299738
 
5.0%
l 234073
 
3.9%
Other values (143) 1907359
31.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4733336
78.9%
Space Separator 740532
 
12.3%
Other Punctuation 242637
 
4.0%
Uppercase Letter 221438
 
3.7%
Decimal Number 25398
 
0.4%
Dash Punctuation 13197
 
0.2%
Control 8046
 
0.1%
Open Punctuation 4942
 
0.1%
Close Punctuation 4924
 
0.1%
Math Symbol 1828
 
< 0.1%
Other values (8) 344
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 583099
12.3%
r 430338
 
9.1%
a 415456
 
8.8%
o 409182
 
8.6%
n 340698
 
7.2%
s 329561
 
7.0%
t 306586
 
6.5%
i 299738
 
6.3%
l 234073
 
4.9%
d 233984
 
4.9%
Other values (46) 1150621
24.3%
Uppercase Letter
ValueCountFrequency (%)
S 28522
12.9%
O 21270
 
9.6%
P 17531
 
7.9%
I 13516
 
6.1%
F 13391
 
6.0%
R 12633
 
5.7%
A 12186
 
5.5%
D 11776
 
5.3%
C 11713
 
5.3%
M 11182
 
5.0%
Other values (29) 67718
30.6%
Other Punctuation
ValueCountFrequency (%)
. 154594
63.7%
, 66676
27.5%
; 13156
 
5.4%
' 2774
 
1.1%
/ 2038
 
0.8%
: 1553
 
0.6%
& 617
 
0.3%
? 553
 
0.2%
" 368
 
0.2%
% 178
 
0.1%
Other values (7) 130
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 6268
24.7%
1 3980
15.7%
2 3363
13.2%
5 3066
12.1%
3 2053
 
8.1%
4 1898
 
7.5%
6 1305
 
5.1%
9 1273
 
5.0%
7 1102
 
4.3%
8 1090
 
4.3%
Math Symbol
ValueCountFrequency (%)
+ 1433
78.4%
± 152
 
8.3%
| 88
 
4.8%
= 70
 
3.8%
> 40
 
2.2%
< 36
 
2.0%
~ 8
 
0.4%
× 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 4598
93.0%
[ 341
 
6.9%
2
 
< 0.1%
{ 1
 
< 0.1%
Other Number
ValueCountFrequency (%)
½ 25
59.5%
² 16
38.1%
¼ 1
 
2.4%
Modifier Symbol
ValueCountFrequency (%)
` 7
70.0%
^ 2
 
20.0%
´ 1
 
10.0%
Dash Punctuation
ValueCountFrequency (%)
- 13183
99.9%
14
 
0.1%
Control
ValueCountFrequency (%)
8010
99.6%
36
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 4589
93.2%
] 335
 
6.8%
Space Separator
ValueCountFrequency (%)
740532
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 186
100.0%
Other Symbol
ValueCountFrequency (%)
° 85
100.0%
Other Letter
ValueCountFrequency (%)
º 10
100.0%
Final Punctuation
ValueCountFrequency (%)
5
100.0%
Initial Punctuation
ValueCountFrequency (%)
5
100.0%
Currency Symbol
ValueCountFrequency (%)
£ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4954782
82.6%
Common 1041840
 
17.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 583099
11.8%
r 430338
 
8.7%
a 415456
 
8.4%
o 409182
 
8.3%
n 340698
 
6.9%
s 329561
 
6.7%
t 306586
 
6.2%
i 299738
 
6.0%
l 234073
 
4.7%
d 233984
 
4.7%
Other values (85) 1372067
27.7%
Common
ValueCountFrequency (%)
740532
71.1%
. 154594
 
14.8%
, 66676
 
6.4%
- 13183
 
1.3%
; 13156
 
1.3%
8010
 
0.8%
0 6268
 
0.6%
( 4598
 
0.4%
) 4589
 
0.4%
1 3980
 
0.4%
Other values (48) 26254
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5980939
99.7%
None 15655
 
0.3%
Punctuation 28
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
740532
 
12.4%
e 583099
 
9.7%
r 430338
 
7.2%
a 415456
 
6.9%
o 409182
 
6.8%
n 340698
 
5.7%
s 329561
 
5.5%
t 306586
 
5.1%
i 299738
 
5.0%
l 234073
 
3.9%
Other values (84) 1891676
31.6%
None
ValueCountFrequency (%)
é 4841
30.9%
ê 4190
26.8%
è 2304
14.7%
à 1106
 
7.1%
á 552
 
3.5%
ä 402
 
2.6%
ü 262
 
1.7%
í 240
 
1.5%
ú 191
 
1.2%
ó 183
 
1.2%
Other values (44) 1384
 
8.8%
Punctuation
ValueCountFrequency (%)
14
50.0%
5
 
17.9%
5
 
17.9%
2
 
7.1%
2
 
7.1%

sampleSizeValue
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 5
Distinct characters 4
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 0.0 m
ValueCountFrequency (%)
0.0 1
50.0%
m 1
50.0%

Most occurring characters

ValueCountFrequency (%)
0 2
40.0%
. 1
20.0%
1
20.0%
m 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2
40.0%
Other Punctuation 1
20.0%
Space Separator 1
20.0%
Lowercase Letter 1
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Lowercase Letter
ValueCountFrequency (%)
m 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
80.0%
Latin 1
 
20.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2
50.0%
. 1
25.0%
1
25.0%
Latin
ValueCountFrequency (%)
m 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2
40.0%
. 1
20.0%
1
20.0%
m 1
20.0%

higherGeography
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 51.41942
ValueCountFrequency (%)
51.41942 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 2
25.0%
4 2
25.0%
5 1
12.5%
. 1
12.5%
9 1
12.5%
2 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
87.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
28.6%
4 2
28.6%
5 1
14.3%
9 1
14.3%
2 1
14.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
25.0%
4 2
25.0%
5 1
12.5%
. 1
12.5%
9 1
12.5%
2 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
25.0%
4 2
25.0%
5 1
12.5%
. 1
12.5%
9 1
12.5%
2 1
12.5%

continent
Text

Missing 

Distinct 8
Distinct (%) < 0.1%
Missing 150752
Missing (%) 18.0%
Memory size 6.4 MiB

Length

Max length 13
Median length 10
Mean length 6.548962225
Min length 4

Characters and Unicode

Total characters 4489032
Distinct characters 21
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row EUROPE
2nd row ASIA
3rd row EUROPE
4th row ASIA
5th row EUROPE
ValueCountFrequency (%)
asia 209000
30.5%
europe 197368
28.8%
africa 113320
16.5%
south_america 65750
 
9.6%
oceania 60888
 
8.9%
north_america 38877
 
5.7%
antarctica 253
 
< 0.1%
3.69787 1
 
< 0.1%
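
The frequency table above lists seven continent names plus a single numeric value, 3.69787, which looks like a value shifted in from another column rather than a continent. A minimal sketch of how such strays could be flagged is shown below. The expected set is an assumption based on the seven names visible in this table (upper-cased to match the sample rows), and `unexpected_values` is a hypothetical helper.

```python
# Assumed vocabulary: the seven names that appear in the frequency table.
EXPECTED_CONTINENTS = {
    "AFRICA",
    "ANTARCTICA",
    "ASIA",
    "EUROPE",
    "NORTH_AMERICA",
    "OCEANIA",
    "SOUTH_AMERICA",
}


def unexpected_values(values, expected=EXPECTED_CONTINENTS):
    """Return the sorted distinct non-missing values that are not in the expected set."""
    return sorted({v for v in values if v is not None and v not in expected})


column = ["EUROPE", "ASIA", "EUROPE", None, "3.69787", "SOUTH_AMERICA"]
print(unexpected_values(column))
```

Rows flagged this way are worth inspecting as a whole, since a shifted value in one column usually means neighbouring columns in the same record are misaligned too.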

Most occurring characters

ValueCountFrequency (%)
A 976429
21.8%
E 560251
12.5%
I 488088
10.9%
R 454445
10.1%
O 362883
 
8.1%
C 279341
 
6.2%
S 274750
 
6.1%
U 263118
 
5.9%
P 197368
 
4.4%
F 113320
 
2.5%
Other values (11) 519039
11.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4384398
97.7%
Connector Punctuation 104627
 
2.3%
Decimal Number 6
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 976429
22.3%
E 560251
12.8%
I 488088
11.1%
R 454445
10.4%
O 362883
 
8.3%
C 279341
 
6.4%
S 274750
 
6.3%
U 263118
 
6.0%
P 197368
 
4.5%
F 113320
 
2.6%
Other values (4) 414405
9.5%
Decimal Number
ValueCountFrequency (%)
7 2
33.3%
3 1
16.7%
6 1
16.7%
9 1
16.7%
8 1
16.7%
Connector Punctuation
ValueCountFrequency (%)
_ 104627
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4384398
97.7%
Common 104634
 
2.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 976429
22.3%
E 560251
12.8%
I 488088
11.1%
R 454445
10.4%
O 362883
 
8.3%
C 279341
 
6.4%
S 274750
 
6.3%
U 263118
 
6.0%
P 197368
 
4.5%
F 113320
 
2.6%
Other values (4) 414405
9.5%
Common
ValueCountFrequency (%)
_ 104627
> 99.9%
7 2
 
< 0.1%
3 1
 
< 0.1%
. 1
 
< 0.1%
6 1
 
< 0.1%
9 1
 
< 0.1%
8 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4489032
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 976429
21.8%
E 560251
12.5%
I 488088
10.9%
R 454445
10.1%
O 362883
 
8.1%
C 279341
 
6.2%
S 274750
 
6.1%
U 263118
 
5.9%
P 197368
 
4.4%
F 113320
 
2.5%
Other values (11) 519039
11.6%
Distinct       233
Distinct (%)   < 0.1%
Missing        3342
Missing (%)    0.4%
Memory size    6.4 MiB

Length

Max length      2
Median length   2
Mean length     2
Min length      2

Characters and Unicode

Total characters      1665734
Distinct characters   26
Distinct categories   1
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       3
Unique (%)   < 0.1%

Sample

1st row   FR
2nd row   ID
3rd row   FR
4th row   ID
5th row   GR

Value                Count    Frequency (%)
zz                   149452   17.9%
nl                   119066   14.3%
id                   96368    11.6%
my                   37231    4.5%
pg                   26154    3.1%
br                   18998    2.3%
fr                   18916    2.3%
us                   18613    2.2%
au                   18586    2.2%
th                   18570    2.2%
Other values (223)   310913   37.3%

Most occurring characters

ValueCountFrequency (%)
Z 324591
19.5%
N 152347
 
9.1%
L 133105
 
8.0%
I 124929
 
7.5%
D 113510
 
6.8%
M 73244
 
4.4%
G 71010
 
4.3%
C 67123
 
4.0%
P 64783
 
3.9%
R 63303
 
3.8%
Other values (16) 477789
28.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1665734
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Z 324591
19.5%
N 152347
 
9.1%
L 133105
 
8.0%
I 124929
 
7.5%
D 113510
 
6.8%
M 73244
 
4.4%
G 71010
 
4.3%
C 67123
 
4.0%
P 64783
 
3.9%
R 63303
 
3.8%
Other values (16) 477789
28.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 1665734
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Z 324591
19.5%
N 152347
 
9.1%
L 133105
 
8.0%
I 124929
 
7.5%
D 113510
 
6.8%
M 73244
 
4.4%
G 71010
 
4.3%
C 67123
 
4.0%
P 64783
 
3.9%
R 63303
 
3.8%
Other values (16) 477789
28.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1665734
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Z 324591
19.5%
N 152347
 
9.1%
L 133105
 
8.0%
I 124929
 
7.5%
D 113510
 
6.8%
M 73244
 
4.4%
G 71010
 
4.3%
C 67123
 
4.0%
P 64783
 
3.9%
R 63303
 
3.8%
Other values (16) 477789
28.7%

stateProvince
Text

Missing 

Distinct       2396
Distinct (%)   0.7%
Missing        512889
Missing (%)    61.3%
Memory size    6.4 MiB

Length

Max length      35
Median length   27
Mean length     8.84108623
Min length      3

Characters and Unicode

Total characters      2858500
Distinct characters   101
Distinct categories   10
Distinct scripts      2
Distinct blocks       3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       478
Unique (%)   0.1%

Sample

1st row   Sumatra
2nd row   Borneo
3rd row   Borneo
4th row   Sumatra
5th row   Sumatra

Value                 Count    Frequency (%)
borneo                39582    9.7%
new                   35029    8.6%
guinea                32739    8.0%
java                  22932    5.6%
sumatra               14157    3.5%
region                13293    3.2%
northern              9291     2.3%
zuid-holland          8882     2.2%
gelderland            7192     1.8%
sulawesi              6585     1.6%
Other values (2460)   219524   53.6%

Most occurring characters

ValueCountFrequency (%)
a 320355
 
11.2%
e 273195
 
9.6%
o 241553
 
8.5%
n 240670
 
8.4%
r 190437
 
6.7%
u 144597
 
5.1%
i 143886
 
5.0%
l 111277
 
3.9%
t 104424
 
3.7%
s 91117
 
3.2%
Other values (91) 996989
34.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2274891
79.6%
Uppercase Letter 448270
 
15.7%
Space Separator 85886
 
3.0%
Dash Punctuation 42062
 
1.5%
Open Punctuation 2935
 
0.1%
Close Punctuation 2908
 
0.1%
Other Punctuation 1494
 
0.1%
Decimal Number 37
 
< 0.1%
Final Punctuation 16
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 320355
14.1%
e 273195
12.0%
o 241553
10.6%
n 240670
10.6%
r 190437
8.4%
u 144597
 
6.4%
i 143886
 
6.3%
l 111277
 
4.9%
t 104424
 
4.6%
s 91117
 
4.0%
Other values (39) 413380
18.2%
Uppercase Letter
ValueCountFrequency (%)
N 66536
14.8%
S 52217
11.6%
B 50547
11.3%
G 46862
10.5%
J 23670
 
5.3%
M 21688
 
4.8%
L 20607
 
4.6%
H 19364
 
4.3%
R 16592
 
3.7%
C 14802
 
3.3%
Other values (21) 115385
25.7%
Decimal Number
ValueCountFrequency (%)
4 9
24.3%
7 7
18.9%
6 5
13.5%
5 5
13.5%
2 4
10.8%
3 4
10.8%
8 2
 
5.4%
1 1
 
2.7%
Other Punctuation
ValueCountFrequency (%)
. 867
58.0%
' 456
30.5%
, 158
 
10.6%
? 5
 
0.3%
& 5
 
0.3%
/ 3
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 42034
99.9%
28
 
0.1%
Space Separator
ValueCountFrequency (%)
85886
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2935
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2908
100.0%
Final Punctuation
ValueCountFrequency (%)
16
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2723161
95.3%
Common 135339
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 320355
 
11.8%
e 273195
 
10.0%
o 241553
 
8.9%
n 240670
 
8.8%
r 190437
 
7.0%
u 144597
 
5.3%
i 143886
 
5.3%
l 111277
 
4.1%
t 104424
 
3.8%
s 91117
 
3.3%
Other values (70) 861650
31.6%
Common
ValueCountFrequency (%)
85886
63.5%
- 42034
31.1%
( 2935
 
2.2%
) 2908
 
2.1%
. 867
 
0.6%
' 456
 
0.3%
, 158
 
0.1%
28
 
< 0.1%
16
 
< 0.1%
4 9
 
< 0.1%
Other values (11) 42
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2836154
99.2%
None 22302
 
0.8%
Punctuation 44
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 320355
 
11.3%
e 273195
 
9.6%
o 241553
 
8.5%
n 240670
 
8.5%
r 190437
 
6.7%
u 144597
 
5.1%
i 143886
 
5.1%
l 111277
 
3.9%
t 104424
 
3.7%
s 91117
 
3.2%
Other values (61) 974643
34.4%
None
ValueCountFrequency (%)
é 15488
69.4%
á 2406
 
10.8%
í 1047
 
4.7%
ô 639
 
2.9%
ó 629
 
2.8%
ü 572
 
2.6%
ä 289
 
1.3%
ã 266
 
1.2%
è 179
 
0.8%
ö 118
 
0.5%
Other values (18) 669
 
3.0%
Punctuation
ValueCountFrequency (%)
28
63.6%
16
36.4%

locality
Text

Missing 

Distinct       529376
Distinct (%)   74.3%
Missing        123808
Missing (%)    14.8%
Memory size    6.4 MiB

Length

Max length      1249522
Median length   342
Mean length     47.85982052
Min length      1

Characters and Unicode

Total characters      34095384
Distinct characters   203
Distinct categories   19
Distinct scripts      2
Distinct blocks       5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       472473
Unique (%)   66.3%

Sample

1st row   Nice.
2nd row   E. Coast Sumatra, Siak, Indrapura
3rd row   Corsica; Cargèse.
4th row   Patras, op rots, bij ruine.
5th row   West Borneo, Sintang G. Pahoe

Value                   Count     Frequency (%)
of                      163058    3.2%
de                      87146     1.7%
km                      68383     1.4%
(blank)                 60966     1.2%
in                      53973     1.1%
the                     38597     0.8%
near                    36540     0.7%
road                    35833     0.7%
bij                     34165     0.7%
district                32502     0.6%
Other values (379124)   4425190   87.9%

Most occurring characters

ValueCountFrequency (%)
4213362
 
12.4%
a 2982924
 
8.7%
e 2490880
 
7.3%
n 1922345
 
5.6%
o 1814720
 
5.3%
i 1762625
 
5.2%
r 1675236
 
4.9%
t 1309915
 
3.8%
. 1270912
 
3.7%
l 1158859
 
3.4%
Other values (193) 13493606
39.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22696643
66.6%
Space Separator 4213362
 
12.4%
Uppercase Letter 3524739
 
10.3%
Other Punctuation 2247204
 
6.6%
Decimal Number 709738
 
2.1%
Control 383114
 
1.1%
Dash Punctuation 136444
 
0.4%
Open Punctuation 80934
 
0.2%
Close Punctuation 80592
 
0.2%
Math Symbol 11289
 
< 0.1%
Other values (9) 11325
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2982924
13.1%
e 2490880
11.0%
n 1922345
 
8.5%
o 1814720
 
8.0%
i 1762625
 
7.8%
r 1675236
 
7.4%
t 1309915
 
5.8%
l 1158859
 
5.1%
s 1157830
 
5.1%
u 870000
 
3.8%
Other values (57) 5551309
24.5%
Uppercase Letter
ValueCountFrequency (%)
S 333610
 
9.5%
P 267959
 
7.6%
M 248778
 
7.1%
B 240852
 
6.8%
N 221270
 
6.3%
C 207032
 
5.9%
A 190780
 
5.4%
T 169441
 
4.8%
R 167041
 
4.7%
L 152962
 
4.3%
Other values (46) 1325014
37.6%
Other Punctuation
ValueCountFrequency (%)
. 1270912
56.6%
, 717111
31.9%
: 113198
 
5.0%
; 34269
 
1.5%
' 29687
 
1.3%
/ 23803
 
1.1%
! 19102
 
0.9%
* 18825
 
0.8%
" 10856
 
0.5%
? 5322
 
0.2%
Other values (10) 4119
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 130535
18.4%
0 99409
14.0%
2 99351
14.0%
5 70127
9.9%
4 67431
9.5%
3 63641
9.0%
6 53404
7.5%
7 47821
 
6.7%
8 41309
 
5.8%
9 36710
 
5.2%
Math Symbol
ValueCountFrequency (%)
| 3498
31.0%
± 3405
30.2%
= 2485
22.0%
> 896
 
7.9%
< 621
 
5.5%
+ 341
 
3.0%
× 28
 
0.2%
~ 14
 
0.1%
÷ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 73371
90.7%
[ 7435
 
9.2%
83
 
0.1%
30
 
< 0.1%
{ 15
 
< 0.1%
Other Number
ValueCountFrequency (%)
½ 907
75.3%
¼ 183
 
15.2%
¾ 87
 
7.2%
² 19
 
1.6%
³ 9
 
0.7%
Final Punctuation
ValueCountFrequency (%)
» 30
66.7%
8
 
17.8%
4
 
8.9%
3
 
6.7%
Initial Punctuation
ValueCountFrequency (%)
« 22
62.9%
9
25.7%
3
 
8.6%
1
 
2.9%
Modifier Symbol
ValueCountFrequency (%)
´ 15
45.5%
` 12
36.4%
^ 4
 
12.1%
¨ 2
 
6.1%
Dash Punctuation
ValueCountFrequency (%)
- 136425
> 99.9%
13
 
< 0.1%
6
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 73201
90.8%
] 7384
 
9.2%
} 7
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 918
99.6%
® 3
 
0.3%
¦ 1
 
0.1%
Currency Symbol
ValueCountFrequency (%)
¢ 2
50.0%
¤ 1
25.0%
$ 1
25.0%
Control
ValueCountFrequency (%)
381396
99.6%
1718
 
0.4%
Other Letter
ValueCountFrequency (%)
º 204
97.6%
ª 5
 
2.4%
Space Separator
ValueCountFrequency (%)
4213362
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8870
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 26221591
76.9%
Common 7873793
 
23.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2982924
 
11.4%
e 2490880
 
9.5%
n 1922345
 
7.3%
o 1814720
 
6.9%
i 1762625
 
6.7%
r 1675236
 
6.4%
t 1309915
 
5.0%
l 1158859
 
4.4%
s 1157830
 
4.4%
u 870000
 
3.3%
Other values (115) 9076257
34.6%
Common
ValueCountFrequency (%)
4213362
53.5%
. 1270912
 
16.1%
, 717111
 
9.1%
381396
 
4.8%
- 136425
 
1.7%
1 130535
 
1.7%
: 113198
 
1.4%
0 99409
 
1.3%
2 99351
 
1.3%
( 73371
 
0.9%
Other values (68) 638723
 
8.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33984659
99.7%
None 110534
 
0.3%
Punctuation 183
 
< 0.1%
Latin Ext Additional 6
 
< 0.1%
Modifier Letters 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4213362
 
12.4%
a 2982924
 
8.8%
e 2490880
 
7.3%
n 1922345
 
5.7%
o 1814720
 
5.3%
i 1762625
 
5.2%
r 1675236
 
4.9%
t 1309915
 
3.9%
. 1270912
 
3.7%
l 1158859
 
3.4%
Other values (87) 13382881
39.4%
None
ValueCountFrequency (%)
é 38543
34.9%
á 8087
 
7.3%
è 7944
 
7.2%
ü 6159
 
5.6%
ö 4779
 
4.3%
í 4351
 
3.9%
ë 4307
 
3.9%
ä 3869
 
3.5%
ó 3731
 
3.4%
ê 3719
 
3.4%
Other values (79) 25045
22.7%
Punctuation
ValueCountFrequency (%)
83
45.4%
30
 
16.4%
13
 
7.1%
10
 
5.5%
9
 
4.9%
8
 
4.4%
8
 
4.4%
6
 
3.3%
5
 
2.7%
4
 
2.2%
Other values (3) 7
 
3.8%
Latin Ext Additional
ValueCountFrequency (%)
3
50.0%
2
33.3%
1
 
16.7%
Modifier Letters
ValueCountFrequency (%)
ˆ 2
100.0%

verbatimElevation
Text

Missing 

Distinct       4412
Distinct (%)   1.5%
Missing        540040
Missing (%)    64.6%
Memory size    6.4 MiB

Length

Max length      17
Median length   5
Mean length     6.284624657
Min length      5

Characters and Unicode

Total characters      1861311
Distinct characters   14
Distinct categories   5
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       1520
Unique (%)   0.5%

Sample

1st row   10.0 m
2nd row   600.0 m
3rd row   250.0 m
4th row   20.0 m
5th row   4.0 m

Value                 Count    Frequency (%)
m                     296169   47.8%
0.0                   166548   26.9%
(blank)               13757    2.2%
100.0                 4950     0.8%
200.0                 4534     0.7%
50.0                  4284     0.7%
300.0                 3487     0.6%
400.0                 3461     0.6%
500.0                 3399     0.5%
1000.0                3080     0.5%
Other values (2344)   116183   18.7%
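
The sample rows show elevations stored as text in the form `<number> m`. The maximum length of 17 and the dash characters counted for this variable would also be consistent with ranges such as `1000.0 - 2000.0 m`, but that form is an assumption, not something the report shows directly. The sketch below parses both forms into a `(low, high)` pair of metres; `parse_elevation` is a hypothetical helper name.

```python
import re

# One number, optionally followed by " - " and a second number, then the unit "m".
_ELEVATION = re.compile(
    r"^\s*(-?\d+(?:\.\d+)?)(?:\s+-\s+(-?\d+(?:\.\d+)?))?\s*m\s*$"
)


def parse_elevation(text):
    """Return (low, high) in metres, or None when the text does not match either form."""
    match = _ELEVATION.match(text)
    if match is None:
        return None
    low = float(match.group(1))
    # A single value is reported as a degenerate range.
    high = float(match.group(2)) if match.group(2) is not None else low
    return (low, high)


single = parse_elevation("10.0 m")
ranged = parse_elevation("1000.0 - 2000.0 m")
```

Returning `None` for anything unrecognised keeps unparsed strings visible for manual review instead of silently coercing them.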

Most occurring characters

ValueCountFrequency (%)
0 647657
34.8%
323683
17.4%
. 309926
16.7%
m 296169
15.9%
1 62594
 
3.4%
5 51541
 
2.8%
2 41404
 
2.2%
3 26542
 
1.4%
4 21812
 
1.2%
6 19042
 
1.0%
Other values (4) 60941
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 917406
49.3%
Space Separator 323683
 
17.4%
Other Punctuation 309926
 
16.7%
Lowercase Letter 296169
 
15.9%
Dash Punctuation 14127
 
0.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 647657
70.6%
1 62594
 
6.8%
5 51541
 
5.6%
2 41404
 
4.5%
3 26542
 
2.9%
4 21812
 
2.4%
6 19042
 
2.1%
7 18170
 
2.0%
8 15643
 
1.7%
9 13001
 
1.4%
Space Separator
ValueCountFrequency (%)
323683
100.0%
Other Punctuation
ValueCountFrequency (%)
. 309926
100.0%
Lowercase Letter
ValueCountFrequency (%)
m 296169
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 14127
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1565142
84.1%
Latin 296169
 
15.9%

Most frequent character per script

Common
ValueCountFrequency (%)
0 647657
41.4%
323683
20.7%
. 309926
19.8%
1 62594
 
4.0%
5 51541
 
3.3%
2 41404
 
2.6%
3 26542
 
1.7%
4 21812
 
1.4%
6 19042
 
1.2%
7 18170
 
1.2%
Other values (3) 42771
 
2.7%
Latin
ValueCountFrequency (%)
m 296169
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1861311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 647657
34.8%
323683
17.4%
. 309926
16.7%
m 296169
15.9%
1 62594
 
3.4%
5 51541
 
2.8%
2 41404
 
2.2%
3 26542
 
1.4%
4 21812
 
1.2%
6 19042
 
1.0%
Other values (4) 60941
 
3.3%

decimalLatitude
Text

Missing 

Distinct       37379
Distinct (%)   10.6%
Missing        483055
Missing (%)    57.8%
Memory size    6.4 MiB

Length

Max length      9
Median length   8
Mean length     6.959269894
Min length      3

Characters and Unicode

Total characters      2457694
Distinct characters   13
Distinct categories   4
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       19749
Unique (%)   5.6%

Sample

1st row   -2.06667
2nd row   0.0
3rd row   -2.18333
4th row   -2.18333
5th row   1.16667

Value                  Count    Frequency (%)
52.16011               2387     0.7%
7.25                   1447     0.4%
5.83333                1431     0.4%
1.0                    1312     0.4%
3.08333                1267     0.4%
6.08333                1210     0.3%
52.14714               1142     0.3%
51.83515               1140     0.3%
5.33333                1113     0.3%
5.38333                1100     0.3%
Other values (34950)   339605   96.2%
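
Because this variable is profiled as text, coordinates need converting before numeric use. The character tables for this variable count three `E` characters, which may indicate exponent notation; that reading is an assumption. The sketch below, using a hypothetical helper `parse_coordinate`, converts a string with Python's built-in `float` (which accepts exponent notation) and rejects values outside a given absolute limit: 90 for latitude, 180 for longitude.

```python
import math


def parse_coordinate(text, limit):
    """Convert a coordinate string to float; return None if invalid or beyond +/- limit."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    # Reject NaN, infinities, and values outside the valid range.
    if not math.isfinite(value) or abs(value) > limit:
        return None
    return value


lat = parse_coordinate("-2.06667", 90)
too_large = parse_coordinate("124.58333", 90)
lon = parse_coordinate("124.58333", 180)
```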

Most occurring characters

ValueCountFrequency (%)
3 365142
14.9%
. 353154
14.4%
6 265229
10.8%
5 263570
10.7%
1 243547
9.9%
7 180077
7.3%
2 178855
7.3%
8 152951
6.2%
4 124787
 
5.1%
0 124225
 
5.1%
Other values (3) 206157
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2003589
81.5%
Other Punctuation 353154
 
14.4%
Dash Punctuation 100948
 
4.1%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 365142
18.2%
6 265229
13.2%
5 263570
13.2%
1 243547
12.2%
7 180077
9.0%
2 178855
8.9%
8 152951
7.6%
4 124787
 
6.2%
0 124225
 
6.2%
9 105206
 
5.3%
Other Punctuation
ValueCountFrequency (%)
. 353154
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 100948
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2457691
> 99.9%
Latin 3
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3 365142
14.9%
. 353154
14.4%
6 265229
10.8%
5 263570
10.7%
1 243547
9.9%
7 180077
7.3%
2 178855
7.3%
8 152951
6.2%
4 124787
 
5.1%
0 124225
 
5.1%
Other values (2) 206154
8.4%
Latin
ValueCountFrequency (%)
E 3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2457694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 365142
14.9%
. 353154
14.4%
6 265229
10.8%
5 263570
10.7%
1 243547
9.9%
7 180077
7.3%
2 178855
7.3%
8 152951
6.2%
4 124787
 
5.1%
0 124225
 
5.1%
Other values (3) 206157
8.4%

decimalLongitude
Text

Missing 

Distinct       43205
Distinct (%)   12.2%
Missing        483055
Missing (%)    57.8%
Memory size    6.4 MiB

Length

Max length      10
Median length   9
Mean length     7.300126857
Min length      3

Characters and Unicode

Total characters      2578069
Distinct characters   12
Distinct categories   3
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       21461
Unique (%)   6.1%

Sample

1st row   100.93333
2nd row   112.0
3rd row   99.65
4th row   99.65
5th row   124.58333

Value                  Count    Frequency (%)
4.49701                2387     0.7%
10.41667               1222     0.3%
4.05                   1206     0.3%
5.85874                1140     0.3%
3.01667                1134     0.3%
4.47406                1109     0.3%
4.90993                911      0.3%
4.32798                895      0.3%
4.47863                871      0.2%
106.7913               866      0.2%
Other values (41279)   341413   96.7%

Most occurring characters

ValueCountFrequency (%)
3 358900
13.9%
. 353154
13.7%
1 351204
13.6%
6 304827
11.8%
7 199473
7.7%
5 198837
7.7%
4 184388
7.2%
8 155506
6.0%
9 144673
5.6%
0 144241
5.6%
Other values (2) 182866
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2180349
84.6%
Other Punctuation 353154
 
13.7%
Dash Punctuation 44566
 
1.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 358900
16.5%
1 351204
16.1%
6 304827
14.0%
7 199473
9.1%
5 198837
9.1%
4 184388
8.5%
8 155506
7.1%
9 144673
6.6%
0 144241
6.6%
2 138300
 
6.3%
Other Punctuation
ValueCountFrequency (%)
. 353154
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 44566
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2578069
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 358900
13.9%
. 353154
13.7%
1 351204
13.6%
6 304827
11.8%
7 199473
7.7%
5 198837
7.7%
4 184388
7.2%
8 155506
6.0%
9 144673
5.6%
0 144241
5.6%
Other values (2) 182866
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2578069
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 358900
13.9%
. 353154
13.7%
1 351204
13.6%
6 304827
11.8%
7 199473
7.7%
5 198837
7.7%
4 184388
7.2%
8 155506
6.0%
9 144673
5.6%
0 144241
5.6%
Other values (2) 182866
7.1%

latestEraOrHighestErathem
Text

Constant  Missing 

Distinct       1
Distinct (%)   100.0%
Missing        836208
Missing (%)    > 99.9%
Memory size    6.4 MiB

Length

Max length      8
Median length   8
Mean length     8
Min length      8

Characters and Unicode

Total characters      8
Distinct characters   7
Distinct categories   3
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       1
Unique (%)   100.0%

Sample

1st row   Bakker S

Value    Count   Frequency (%)
bakker   1       50.0%
s        1       50.0%
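
This variable has exactly one non-missing value among 836209 observations, and that value looks like a person's name rather than a geological era. One possible explanation is a single record whose fields are shifted; that interpretation is an assumption, not something the report states. The sketch below lists such near-empty columns together with the row indices that fill them, so the affected record can be inspected. The function name `sparse_columns` and the toy rows are hypothetical.

```python
def sparse_columns(rows, max_non_missing=1):
    """Map each column with 1..max_non_missing non-missing values to the row indices holding them."""
    filled = {}
    for index, row in enumerate(rows):
        for column, value in row.items():
            filled.setdefault(column, [])
            # Treat None and the empty string as missing.
            if value not in (None, ""):
                filled[column].append(index)
    return {
        column: indices
        for column, indices in filled.items()
        if 0 < len(indices) <= max_non_missing
    }


# Toy rows standing in for occurrence records.
rows = [
    {"continent": "EUROPE", "latestEraOrHighestErathem": None},
    {"continent": "ASIA", "latestEraOrHighestErathem": "Bakker S"},
    {"continent": "EUROPE", "latestEraOrHighestErathem": None},
]
suspects = sparse_columns(rows)
```

If several near-empty columns all point at the same row index, that row is a natural first candidate for a field-alignment check.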

Most occurring characters

ValueCountFrequency (%)
k 2
25.0%
B 1
12.5%
a 1
12.5%
e 1
12.5%
r 1
12.5%
1
12.5%
S 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5
62.5%
Uppercase Letter 2
 
25.0%
Space Separator 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
k 2
40.0%
a 1
20.0%
e 1
20.0%
r 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
B 1
50.0%
S 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
87.5%
Common 1
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
k 2
28.6%
B 1
14.3%
a 1
14.3%
e 1
14.3%
r 1
14.3%
S 1
14.3%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
k 2
25.0%
B 1
12.5%
a 1
12.5%
e 1
12.5%
r 1
12.5%
1
12.5%
S 1
12.5%

highestBiostratigraphicZone
Text

Constant  Missing 

Distinct       1
Distinct (%)   100.0%
Missing        836208
Missing (%)    > 99.9%
Memory size    6.4 MiB

Length

Max length      7
Median length   7
Mean length     7
Min length      7

Characters and Unicode

Total characters      7
Distinct characters   5
Distinct categories   1
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       1
Unique (%)   100.0%

Sample

1st row   2608920

Value     Count   Frequency (%)
2608920   1       100.0%

Most occurring characters

ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

identificationID
Text

Constant  Missing 

Distinct       1
Distinct (%)   100.0%
Missing        836208
Missing (%)    > 99.9%
Memory size    6.4 MiB

Length

Max length      30
Median length   30
Mean length     30
Min length      30

Characters and Unicode

Total characters      30
Distinct characters   20
Distinct categories   6
Distinct scripts      2
Distinct blocks       2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       1
Unique (%)   100.0%

Sample

1st row   Physcia caesia (Hoffm.) Fürnr.

Value     Count   Frequency (%)
physcia   1       25.0%
caesia    1       25.0%
hoffm     1       25.0%
fürnr     1       25.0%

Most occurring characters

ValueCountFrequency (%)
a 3
 
10.0%
3
 
10.0%
r 2
 
6.7%
s 2
 
6.7%
c 2
 
6.7%
i 2
 
6.7%
. 2
 
6.7%
f 2
 
6.7%
P 1
 
3.3%
m 1
 
3.3%
Other values (10) 10
33.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20
66.7%
Space Separator 3
 
10.0%
Uppercase Letter 3
 
10.0%
Other Punctuation 2
 
6.7%
Close Punctuation 1
 
3.3%
Open Punctuation 1
 
3.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
15.0%
r 2
10.0%
s 2
10.0%
c 2
10.0%
i 2
10.0%
f 2
10.0%
m 1
 
5.0%
ü 1
 
5.0%
o 1
 
5.0%
h 1
 
5.0%
Other values (3) 3
15.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
F 1
33.3%
H 1
33.3%
Space Separator
ValueCountFrequency (%)
3
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23
76.7%
Common 7
 
23.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
13.0%
r 2
 
8.7%
s 2
 
8.7%
c 2
 
8.7%
i 2
 
8.7%
f 2
 
8.7%
P 1
 
4.3%
m 1
 
4.3%
ü 1
 
4.3%
F 1
 
4.3%
Other values (6) 6
26.1%
Common
ValueCountFrequency (%)
3
42.9%
. 2
28.6%
) 1
 
14.3%
( 1
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29
96.7%
None 1
 
3.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
 
10.3%
3
 
10.3%
r 2
 
6.9%
s 2
 
6.9%
c 2
 
6.9%
i 2
 
6.9%
. 2
 
6.9%
f 2
 
6.9%
P 1
 
3.4%
m 1
 
3.4%
Other values (9) 9
31.0%
None
ValueCountFrequency (%)
ü 1
100.0%

typeStatus
Text

Missing 

Distinct       13
Distinct (%)   0.1%
Missing        822537
Missing (%)    98.4%
Memory size    6.4 MiB

Length

Max length      13
Median length   7
Mean length     7.001828555
Min length      4

Characters and Unicode

Total characters      95729
Distinct characters   13
Distinct categories   1
Distinct scripts      1
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       0
Unique (%)   0.0%

Sample

1st row   HOLOTYPE
2nd row   ISOTYPE
3rd row   TYPE
4th row   LECTOTYPE
5th row   TYPE

Value              Count   Frequency (%)
isotype            6179    45.2%
holotype           2289    16.7%
type               2175    15.9%
syntype            1373    10.0%
paratype           448     3.3%
isolectotype       443     3.2%
lectotype          436     3.2%
isosyntype         179     1.3%
neotype            83      0.6%
isoneotype         51      0.4%
Other values (3)   16      0.1%

Most occurring characters

ValueCountFrequency (%)
Y 15224
15.9%
E 14695
15.4%
T 14562
15.2%
P 14136
14.8%
O 12460
13.0%
S 8404
8.8%
I 6857
7.2%
L 3173
 
3.3%
H 2289
 
2.4%
N 1686
 
1.8%
Other values (3) 2243
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 95729
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 15224
15.9%
E 14695
15.4%
T 14562
15.2%
P 14136
14.8%
O 12460
13.0%
S 8404
8.8%
I 6857
7.2%
L 3173
 
3.3%
H 2289
 
2.4%
N 1686
 
1.8%
Other values (3) 2243
 
2.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 95729
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 15224
15.9%
E 14695
15.4%
T 14562
15.2%
P 14136
14.8%
O 12460
13.0%
S 8404
8.8%
I 6857
7.2%
L 3173
 
3.3%
H 2289
 
2.4%
N 1686
 
1.8%
Other values (3) 2243
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 95729
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 15224
15.9%
E 14695
15.4%
T 14562
15.2%
P 14136
14.8%
O 12460
13.0%
S 8404
8.8%
I 6857
7.2%
L 3173
 
3.3%
H 2289
 
2.4%
N 1686
 
1.8%
Other values (3) 2243
 
2.3%

identifiedBy
Text

Missing 

Distinct       6403
Distinct (%)   4.5%
Missing        693965
Missing (%)    83.0%
Memory size    6.4 MiB

Length

Max length      66
Median length   48
Mean length     11.41105424
Min length      1

Characters and Unicode

Total characters      1623154
Distinct characters   101
Distinct categories   9
Distinct scripts      2
Distinct blocks       2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       2525
Unique (%)   1.8%

Sample

1st row   Wood GHS
2nd row   Steenis CGGJ van
3rd row   Pereira JT; Wong KM
4th row   Ashton PS
5th row   Nooteboom HP

Value                 Count    Frequency (%)
van                   14889    4.5%
de                    8088     2.4%
der                   4298     1.3%
p                     4266     1.3%
a                     4145     1.2%
j                     3993     1.2%
maas                  3807     1.1%
pc                    3671     1.1%
d                     3624     1.1%
cch                   3447     1.0%
Other values (5846)   278934   83.7%
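
The sample value `Pereira JT; Wong KM` suggests that several identifiers can share one field, separated by semicolons. That separator convention is inferred from the sample and is not documented in the report. A minimal sketch for splitting such values, with a hypothetical helper `split_identifiers`:

```python
def split_identifiers(text):
    """Split a semicolon-separated identifier string into trimmed, non-empty names."""
    if not text:
        return []
    return [part.strip() for part in text.split(";") if part.strip()]


names = split_identifiers("Pereira JT; Wong KM")
```

Counting identifiers per person after such a split would give a different ranking from the word-level table above, which breaks names into separate tokens such as `van` and `de`.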

Most occurring characters

ValueCountFrequency (%)
190918
 
11.8%
e 147790
 
9.1%
n 106027
 
6.5%
a 98234
 
6.1%
r 73830
 
4.5%
o 68233
 
4.2%
J 57762
 
3.6%
i 54850
 
3.4%
s 53134
 
3.3%
l 50757
 
3.1%
Other values (91) 721619
44.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 953756
58.8%
Uppercase Letter 462951
28.5%
Space Separator 190918
 
11.8%
Other Punctuation 9766
 
0.6%
Dash Punctuation 5566
 
0.3%
Open Punctuation 90
 
< 0.1%
Close Punctuation 90
 
< 0.1%
Decimal Number 16
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 147790
15.5%
n 106027
11.1%
a 98234
10.3%
r 73830
 
7.7%
o 68233
 
7.2%
i 54850
 
5.8%
s 53134
 
5.6%
l 50757
 
5.3%
d 45593
 
4.8%
t 33486
 
3.5%
Other values (39) 221822
23.3%
Uppercase Letter
ValueCountFrequency (%)
J 57762
12.5%
C 39019
 
8.4%
M 37667
 
8.1%
H 34790
 
7.5%
A 32419
 
7.0%
P 28784
 
6.2%
S 27669
 
6.0%
B 26226
 
5.7%
W 25662
 
5.5%
L 17495
 
3.8%
Other values (23) 135458
29.3%
Other Punctuation
ValueCountFrequency (%)
; 9038
92.5%
. 504
 
5.2%
' 174
 
1.8%
! 43
 
0.4%
? 5
 
0.1%
& 1
 
< 0.1%
: 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 5
31.2%
9 3
18.8%
6 3
18.8%
4 2
 
12.5%
0 1
 
6.2%
3 1
 
6.2%
5 1
 
6.2%
Space Separator
ValueCountFrequency (%)
190918
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5566
100.0%
Open Punctuation
ValueCountFrequency (%)
( 90
100.0%
Close Punctuation
ValueCountFrequency (%)
) 90
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1416707
87.3%
Common 206447
 
12.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 147790
 
10.4%
n 106027
 
7.5%
a 98234
 
6.9%
r 73830
 
5.2%
o 68233
 
4.8%
J 57762
 
4.1%
i 54850
 
3.9%
s 53134
 
3.8%
l 50757
 
3.6%
d 45593
 
3.2%
Other values (72) 660497
46.6%
Common
ValueCountFrequency (%)
190918
92.5%
; 9038
 
4.4%
- 5566
 
2.7%
. 504
 
0.2%
' 174
 
0.1%
( 90
 
< 0.1%
) 90
 
< 0.1%
! 43
 
< 0.1%
? 5
 
< 0.1%
1 5
 
< 0.1%
Other values (9) 14
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1620015
99.8%
None 3139
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
190918
 
11.8%
e 147790
 
9.1%
n 106027
 
6.5%
a 98234
 
6.1%
r 73830
 
4.6%
o 68233
 
4.2%
J 57762
 
3.6%
i 54850
 
3.4%
s 53134
 
3.3%
l 50757
 
3.1%
Other values (61) 718480
44.4%
None
ValueCountFrequency (%)
é 968
30.8%
á 650
20.7%
í 358
 
11.4%
ö 307
 
9.8%
ü 216
 
6.9%
ñ 87
 
2.8%
è 84
 
2.7%
ä 71
 
2.3%
ó 61
 
1.9%
õ 49
 
1.6%
Other values (20) 288
 
9.2%

dateIdentified
Text

Missing 

Distinct       8706
Distinct (%)   12.0%
Missing        763698
Missing (%)    91.3%
Memory size    6.4 MiB

Length

Max length      19
Median length   19
Mean length     19
Min length      19

Characters and Unicode

Total characters      1377709
Distinct characters   13
Distinct categories   4
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       3788
Unique (%)   5.2%

Sample

1st row   1956-11-22T00:00:00
2nd row   1995-09-27T00:00:00
3rd row   1968-07-01T00:00:00
4th row   1972-06-01T00:00:00
5th row   1957-01-18T00:00:00

Value                 Count   Frequency (%)
1955-03-01t00:00:00   346     0.5%
1968-07-01t00:00:00   330     0.5%
1972-06-01t00:00:00   328     0.5%
1995-10-01t00:00:00   275     0.4%
2001-12-01t00:00:00   275     0.4%
1989-08-01t00:00:00   248     0.3%
2000-01-01t00:00:00   236     0.3%
2000-06-01t00:00:00   233     0.3%
1989-04-01t00:00:00   222     0.3%
2001-03-01t00:00:00   221     0.3%
Other values (8696)   69797   96.3%
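
Every value in this variable is 19 characters long, and the sampled values follow the pattern `YYYY-MM-DDT00:00:00`. A minimal sketch for converting such strings to dates with the standard library follows; `parse_identified_date` is a hypothetical helper name, and treating the midnight time component as carrying no information is an assumption.

```python
from datetime import date, datetime


def parse_identified_date(text):
    """Parse an ISO 8601 timestamp string and return its date part, or None if invalid."""
    try:
        return datetime.fromisoformat(text).date()
    except (TypeError, ValueError):
        return None


parsed = parse_identified_date("1956-11-22T00:00:00")
```

Many of the most frequent values fall on the first day of a month, which may mean that only the month was known when the identification was recorded; the sketch does not attempt to detect that.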

Most occurring characters

ValueCountFrequency (%)
0 616792
44.8%
1 146391
 
10.6%
- 145022
 
10.5%
: 145022
 
10.5%
T 72511
 
5.3%
9 65162
 
4.7%
2 65099
 
4.7%
8 23958
 
1.7%
7 22083
 
1.6%
5 20282
 
1.5%
Other values (3) 55387
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1015154
73.7%
Dash Punctuation 145022
 
10.5%
Other Punctuation 145022
 
10.5%
Uppercase Letter 72511
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 616792
60.8%
1 146391
 
14.4%
9 65162
 
6.4%
2 65099
 
6.4%
8 23958
 
2.4%
7 22083
 
2.2%
5 20282
 
2.0%
6 20115
 
2.0%
3 18157
 
1.8%
4 17115
 
1.7%
Dash Punctuation
ValueCountFrequency (%)
- 145022
100.0%
Other Punctuation
ValueCountFrequency (%)
: 145022
100.0%
Uppercase Letter
ValueCountFrequency (%)
T 72511
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1305198
94.7%
Latin 72511
 
5.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 616792
47.3%
1 146391
 
11.2%
- 145022
 
11.1%
: 145022
 
11.1%
9 65162
 
5.0%
2 65099
 
5.0%
8 23958
 
1.8%
7 22083
 
1.7%
5 20282
 
1.6%
6 20115
 
1.5%
Other values (2) 35272
 
2.7%
Latin
ValueCountFrequency (%)
T 72511
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1377709
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 616792
44.8%
1 146391
 
10.6%
- 145022
 
10.5%
: 145022
 
10.5%
T 72511
 
5.3%
9 65162
 
4.7%
2 65099
 
4.7%
8 23958
 
1.7%
7 22083
 
1.6%
5 20282
 
1.5%
Other values (3) 55387
 
4.0%

identificationReferences
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 836208
Missing (%): > 99.9%
Memory size: 6.4 MiB

Length

Max length: 62
Median length: 62
Mean length: 62
Min length: 62

Characters and Unicode

Total characters: 62
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st rowFungi|Lichenes-Lecanoromycetes|Caliciales|Lichenes-Physciaceae
ValueCountFrequency (%)
fungi|lichenes-lecanoromycetes|caliciales|lichenes-physciaceae 1
100.0%

Most occurring characters

ValueCountFrequency (%)
e 10
16.1%
c 7
11.3%
i 6
9.7%
s 5
 
8.1%
a 5
 
8.1%
n 4
 
6.5%
| 3
 
4.8%
L 3
 
4.8%
h 3
 
4.8%
y 2
 
3.2%
Other values (11) 14
22.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 51
82.3%
Uppercase Letter 6
 
9.7%
Math Symbol 3
 
4.8%
Dash Punctuation 2
 
3.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 10
19.6%
c 7
13.7%
i 6
11.8%
s 5
9.8%
a 5
9.8%
n 4
 
7.8%
h 3
 
5.9%
y 2
 
3.9%
l 2
 
3.9%
o 2
 
3.9%
Other values (5) 5
9.8%
Uppercase Letter
ValueCountFrequency (%)
L 3
50.0%
C 1
 
16.7%
F 1
 
16.7%
P 1
 
16.7%
Math Symbol
ValueCountFrequency (%)
| 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 57
91.9%
Common 5
 
8.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 10
17.5%
c 7
12.3%
i 6
10.5%
s 5
8.8%
a 5
8.8%
n 4
 
7.0%
L 3
 
5.3%
h 3
 
5.3%
y 2
 
3.5%
l 2
 
3.5%
Other values (9) 10
17.5%
Common
ValueCountFrequency (%)
| 3
60.0%
- 2
40.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 62
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 10
16.1%
c 7
11.3%
i 6
9.7%
s 5
 
8.1%
a 5
 
8.1%
n 4
 
6.5%
| 3
 
4.8%
L 3
 
4.8%
h 3
 
4.8%
y 2
 
3.2%
Other values (11) 14
22.6%
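
This variable is flagged Constant and Missing because only one of its 836209 rows carries a value. A minimal sketch of how such single-value columns could be listed for manual review follows; the function name `single_value_columns` and the toy rows are hypothetical and are not taken from the dataset or from the profiling tool. It assumes records are available as dictionaries with `None` or an empty string marking missing cells.

```python
def single_value_columns(rows):
    """Map each column that has exactly one non-missing value to that value."""
    present = {}
    for row in rows:
        for column, value in row.items():
            # Treat None and empty strings as missing cells.
            if value not in (None, ""):
                present.setdefault(column, []).append(value)
    return {col: vals[0] for col, vals in present.items() if len(vals) == 1}

# Toy rows for illustration only (invented, not real records).
toy_rows = [
    {"phylum": "Tracheophyta", "taxonID": None, "scientificNameID": None},
    {"phylum": "Ascomycota", "taxonID": "Lecanoromycetes", "scientificNameID": "Caliciales"},
    {"phylum": "Tracheophyta", "taxonID": None, "scientificNameID": ""},
]
flagged = single_value_columns(toy_rows)
```

Reading the flagged values side by side can help judge whether they belong to one record whose fields are offset. That is an interpretation to verify against the source file, not something the report itself states.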

identificationVerificationStatus
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 836208
Missing (%): > 99.9%
Memory size: 6.4 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st rowFungi
ValueCountFrequency (%)
fungi 1
100.0%

Most occurring characters

ValueCountFrequency (%)
F 1
20.0%
u 1
20.0%
n 1
20.0%
g 1
20.0%
i 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4
80.0%
Uppercase Letter 1
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 1
25.0%
n 1
25.0%
g 1
25.0%
i 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
F 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 1
20.0%
u 1
20.0%
n 1
20.0%
g 1
20.0%
i 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 1
20.0%
u 1
20.0%
n 1
20.0%
g 1
20.0%
i 1
20.0%

identificationRemarks
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 836208
Missing (%): > 99.9%
Memory size: 6.4 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st rowAscomycota
ValueCountFrequency (%)
ascomycota 1
100.0%

Most occurring characters

ValueCountFrequency (%)
c 2
20.0%
o 2
20.0%
A 1
10.0%
s 1
10.0%
m 1
10.0%
y 1
10.0%
t 1
10.0%
a 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
90.0%
Uppercase Letter 1
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 2
22.2%
o 2
22.2%
s 1
11.1%
m 1
11.1%
y 1
11.1%
t 1
11.1%
a 1
11.1%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 2
20.0%
o 2
20.0%
A 1
10.0%
s 1
10.0%
m 1
10.0%
y 1
10.0%
t 1
10.0%
a 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 2
20.0%
o 2
20.0%
A 1
10.0%
s 1
10.0%
m 1
10.0%
y 1
10.0%
t 1
10.0%
a 1
10.0%

taxonID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 836208
Missing (%): > 99.9%
Memory size: 6.4 MiB

Length

Max length: 15
Median length: 15
Mean length: 15
Min length: 15

Characters and Unicode

Total characters: 15
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st rowLecanoromycetes
ValueCountFrequency (%)
lecanoromycetes 1
100.0%

Most occurring characters

ValueCountFrequency (%)
e 3
20.0%
c 2
13.3%
o 2
13.3%
L 1
 
6.7%
a 1
 
6.7%
n 1
 
6.7%
r 1
 
6.7%
m 1
 
6.7%
y 1
 
6.7%
t 1
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14
93.3%
Uppercase Letter 1
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3
21.4%
c 2
14.3%
o 2
14.3%
a 1
 
7.1%
n 1
 
7.1%
r 1
 
7.1%
m 1
 
7.1%
y 1
 
7.1%
t 1
 
7.1%
s 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
L 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3
20.0%
c 2
13.3%
o 2
13.3%
L 1
 
6.7%
a 1
 
6.7%
n 1
 
6.7%
r 1
 
6.7%
m 1
 
6.7%
y 1
 
6.7%
t 1
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3
20.0%
c 2
13.3%
o 2
13.3%
L 1
 
6.7%
a 1
 
6.7%
n 1
 
6.7%
r 1
 
6.7%
m 1
 
6.7%
y 1
 
6.7%
t 1
 
6.7%

scientificNameID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 836208
Missing (%): > 99.9%
Memory size: 6.4 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 10
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st rowCaliciales
ValueCountFrequency (%)
caliciales 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
20.0%
l 2
20.0%
i 2
20.0%
C 1
10.0%
c 1
10.0%
e 1
10.0%
s 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
90.0%
Uppercase Letter 1
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
22.2%
l 2
22.2%
i 2
22.2%
c 1
11.1%
e 1
11.1%
s 1
11.1%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
20.0%
l 2
20.0%
i 2
20.0%
C 1
10.0%
c 1
10.0%
e 1
10.0%
s 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
20.0%
l 2
20.0%
i 2
20.0%
C 1
10.0%
c 1
10.0%
e 1
10.0%
s 1
10.0%

Distinct: 126970
Distinct (%): 15.2%
Missing: 48
Missing (%): < 0.1%
Memory size: 6.4 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.962477322
Min length: 1

Characters and Unicode

Total characters: 5821752
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 51536
Unique (%): 6.2%
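
The distinct and unique percentages for this variable can be reproduced from the counts shown above. The sketch below assumes the denominator is the number of non-missing rows (total observations minus missing cells); that is an assumption consistent with the rounded figures, not a documented formula, and the helper name `pct_of_present` is made up for this example.

```python
TOTAL_ROWS = 836209  # "Number of observations" in the report overview
MISSING = 48         # missing cells for this variable
DISTINCT = 126970    # distinct values
UNIQUE = 51536       # values that occur exactly once

def pct_of_present(count, total=TOTAL_ROWS, missing=MISSING):
    """Percentage of non-missing rows, rounded to one decimal place."""
    return round(100 * count / (total - missing), 1)

distinct_pct = pct_of_present(DISTINCT)
unique_pct = pct_of_present(UNIQUE)
```

With only 48 missing rows, using the full row count as the denominator gives the same rounded values here, so this variable alone cannot confirm which denominator the tool uses.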

Sample

1st row3189695
2nd row4097456
3rd row3189695
4th row5284426
5th row3189695
ValueCountFrequency (%)
6 3615
 
0.4%
329 1484
 
0.2%
3177662 1278
 
0.2%
2919963 968
 
0.1%
9458333 756
 
0.1%
3189556 710
 
0.1%
3061139 634
 
0.1%
3029010 607
 
0.1%
3033976 605
 
0.1%
3065 604
 
0.1%
Other values (126960) 824900
98.7%

Most occurring characters

ValueCountFrequency (%)
3 742621
12.8%
2 701754
12.1%
7 632660
10.9%
5 617498
10.6%
8 540994
9.3%
1 531253
9.1%
0 530978
9.1%
9 526129
9.0%
6 511504
8.8%
4 486361
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5821752
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 742621
12.8%
2 701754
12.1%
7 632660
10.9%
5 617498
10.6%
8 540994
9.3%
1 531253
9.1%
0 530978
9.1%
9 526129
9.0%
6 511504
8.8%
4 486361
8.4%

Most occurring scripts

ValueCountFrequency (%)
Common 5821752
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 742621
12.8%
2 701754
12.1%
7 632660
10.9%
5 617498
10.6%
8 540994
9.3%
1 531253
9.1%
0 530978
9.1%
9 526129
9.0%
6 511504
8.8%
4 486361
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5821752
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 742621
12.8%
2 701754
12.1%
7 632660
10.9%
5 617498
10.6%
8 540994
9.3%
1 531253
9.1%
0 530978
9.1%
9 526129
9.0%
6 511504
8.8%
4 486361
8.4%

parentNameUsageID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 836208
Missing (%): > 99.9%
Memory size: 6.4 MiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 11
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st rowPhysciaceae
ValueCountFrequency (%)
physciaceae 1
100.0%

Most occurring characters

ValueCountFrequency (%)
c 2
18.2%
a 2
18.2%
e 2
18.2%
P 1
9.1%
h 1
9.1%
y 1
9.1%
s 1
9.1%
i 1
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
90.9%
Uppercase Letter 1
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 2
20.0%
a 2
20.0%
e 2
20.0%
h 1
10.0%
y 1
10.0%
s 1
10.0%
i 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 2
18.2%
a 2
18.2%
e 2
18.2%
P 1
9.1%
h 1
9.1%
y 1
9.1%
s 1
9.1%
i 1
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 2
18.2%
a 2
18.2%
e 2
18.2%
P 1
9.1%
h 1
9.1%
y 1
9.1%
s 1
9.1%
i 1
9.1%

taxonConceptID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 836208
Missing (%): > 99.9%
Memory size: 6.4 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st rowPhyscia
ValueCountFrequency (%)
physcia 1
100.0%

Most occurring characters

ValueCountFrequency (%)
P 1
14.3%
h 1
14.3%
y 1
14.3%
s 1
14.3%
c 1
14.3%
i 1
14.3%
a 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
h 1
16.7%
y 1
16.7%
s 1
16.7%
c 1
16.7%
i 1
16.7%
a 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 1
14.3%
h 1
14.3%
y 1
14.3%
s 1
14.3%
c 1
14.3%
i 1
14.3%
a 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 1
14.3%
h 1
14.3%
y 1
14.3%
s 1
14.3%
c 1
14.3%
i 1
14.3%
a 1
14.3%

Distinct: 160037
Distinct (%): 19.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 6.4 MiB

Length

Max length: 122
Median length: 83
Mean length: 28.67196559
Min length: 5

Characters and Unicode

Total characters: 23975727
Distinct characters: 115
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 76667
Unique (%): 9.2%

Sample

1st rowPlantago L.
2nd rowShorea platycarpa F.Heim
3rd rowPlantago L.
4th rowAgathis borneensis Warb.
5th rowPlantago L.
ValueCountFrequency (%)
l 225133
 
7.4%
62415
 
2.1%
ex 50021
 
1.6%
blume 32940
 
1.1%
var 30936
 
1.0%
subsp 23446
 
0.8%
dc 18796
 
0.6%
benth 14486
 
0.5%
miq 12616
 
0.4%
willd 10991
 
0.4%
Other values (67507) 2550600
84.1%

Most occurring characters

ValueCountFrequency (%)
a 2206652
 
9.2%
2196172
 
9.2%
i 1707484
 
7.1%
e 1494652
 
6.2%
r 1330069
 
5.5%
l 1202053
 
5.0%
s 1153252
 
4.8%
o 1128720
 
4.7%
. 1104465
 
4.6%
u 1070563
 
4.5%
Other values (105) 9381645
39.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17702223
73.8%
Uppercase Letter 2299837
 
9.6%
Space Separator 2196172
 
9.2%
Other Punctuation 1183273
 
4.9%
Close Punctuation 264993
 
1.1%
Open Punctuation 264993
 
1.1%
Decimal Number 49356
 
0.2%
Dash Punctuation 10840
 
< 0.1%
Math Symbol 4027
 
< 0.1%
Connector Punctuation 13
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2206652
12.5%
i 1707484
 
9.6%
e 1494652
 
8.4%
r 1330069
 
7.5%
l 1202053
 
6.8%
s 1153252
 
6.5%
o 1128720
 
6.4%
u 1070563
 
6.0%
n 1065027
 
6.0%
t 874530
 
4.9%
Other values (49) 4469221
25.2%
Uppercase Letter
ValueCountFrequency (%)
L 328795
14.3%
C 203205
 
8.8%
S 191787
 
8.3%
B 176982
 
7.7%
P 147829
 
6.4%
M 144977
 
6.3%
A 143506
 
6.2%
H 124768
 
5.4%
D 116747
 
5.1%
R 109199
 
4.7%
Other values (26) 612042
26.6%
Decimal Number
ValueCountFrequency (%)
1 14588
29.6%
8 10406
21.1%
9 5352
 
10.8%
4 3245
 
6.6%
7 3237
 
6.6%
3 3172
 
6.4%
2 3063
 
6.2%
0 2636
 
5.3%
5 2070
 
4.2%
6 1587
 
3.2%
Other Punctuation
ValueCountFrequency (%)
. 1104465
93.3%
& 62415
 
5.3%
, 14943
 
1.3%
' 1450
 
0.1%
Space Separator
ValueCountFrequency (%)
2196172
100.0%
Close Punctuation
ValueCountFrequency (%)
) 264993
100.0%
Open Punctuation
ValueCountFrequency (%)
( 264993
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10840
100.0%
Math Symbol
ValueCountFrequency (%)
× 4027
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 13
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20002060
83.4%
Common 3973667
 
16.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2206652
 
11.0%
i 1707484
 
8.5%
e 1494652
 
7.5%
r 1330069
 
6.6%
l 1202053
 
6.0%
s 1153252
 
5.8%
o 1128720
 
5.6%
u 1070563
 
5.4%
n 1065027
 
5.3%
t 874530
 
4.4%
Other values (85) 6769058
33.8%
Common
ValueCountFrequency (%)
2196172
55.3%
. 1104465
27.8%
) 264993
 
6.7%
( 264993
 
6.7%
& 62415
 
1.6%
, 14943
 
0.4%
1 14588
 
0.4%
- 10840
 
0.3%
8 10406
 
0.3%
9 5352
 
0.1%
Other values (10) 24500
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23942518
99.9%
None 33209
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2206652
 
9.2%
2196172
 
9.2%
i 1707484
 
7.1%
e 1494652
 
6.2%
r 1330069
 
5.6%
l 1202053
 
5.0%
s 1153252
 
4.8%
o 1128720
 
4.7%
. 1104465
 
4.6%
u 1070563
 
4.5%
Other values (61) 9348436
39.0%
None
ValueCountFrequency (%)
ü 13768
41.5%
é 7350
22.1%
× 4027
 
12.1%
ö 2099
 
6.3%
ä 1305
 
3.9%
á 858
 
2.6%
ó 794
 
2.4%
è 716
 
2.2%
ø 510
 
1.5%
ê 209
 
0.6%
Other values (34) 1573
 
4.7%

originalNameUsage
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 836208
Missing (%): > 99.9%
Memory size: 6.4 MiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 6
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st rowcaesia
ValueCountFrequency (%)
caesia 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
33.3%
c 1
16.7%
e 1
16.7%
s 1
16.7%
i 1
16.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
33.3%
c 1
16.7%
e 1
16.7%
s 1
16.7%
i 1
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 6
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
33.3%
c 1
16.7%
e 1
16.7%
s 1
16.7%
i 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
33.3%
c 1
16.7%
e 1
16.7%
s 1
16.7%
i 1
16.7%

namePublishedInYear
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 836208
Missing (%): > 99.9%
Memory size: 6.4 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 7
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st rowSPECIES
ValueCountFrequency (%)
species 1
100.0%

Most occurring characters

ValueCountFrequency (%)
S 2
28.6%
E 2
28.6%
P 1
14.3%
C 1
14.3%
I 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 2
28.6%
E 2
28.6%
P 1
14.3%
C 1
14.3%
I 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 2
28.6%
E 2
28.6%
P 1
14.3%
C 1
14.3%
I 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 2
28.6%
E 2
28.6%
P 1
14.3%
C 1
14.3%
I 1
14.3%

Distinct: 1179
Distinct (%): 0.1%
Missing: 93
Missing (%): < 0.1%
Memory size: 6.4 MiB

Length

Max length: 79
Median length: 67
Mean length: 29.8587995
Min length: 9

Characters and Unicode

Total characters: 24965420
Distinct characters: 57
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 125
Unique (%): < 0.1%

Sample

1st rowPlantae|Lamiales|Plantaginaceae
2nd rowPlantae|Malvales|Dipterocarpaceae
3rd rowPlantae|Lamiales|Plantaginaceae
4th rowPlantae|Cupressales|Araucariaceae
5th rowPlantae|Lamiales|Plantaginaceae
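
The sample values above are pipe-delimited, and each appears to list successive taxonomic levels. A minimal sketch of splitting such a value into its parts is shown here; the function name `split_classification` is hypothetical, and the reading of the parts as taxonomic levels is an assumption based on the sample rows only.

```python
def split_classification(value, sep="|"):
    """Split a delimited classification string into its non-empty parts."""
    return [part.strip() for part in value.split(sep) if part.strip()]

# First sample row of this variable.
ranks = split_classification("Plantae|Lamiales|Plantaginaceae")
```

Splitting first makes it possible to compare individual parts with other columns, for instance to see how many parts each record carries.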
ValueCountFrequency (%)
plantae|fabales|fabaceae 52176
 
6.2%
plantae|asterales|asteraceae 51643
 
6.2%
plantae|poales|poaceae 43654
 
5.2%
plantae|gentianales|rubiaceae 32700
 
3.9%
plantae|poales|cyperaceae 22272
 
2.7%
plantae|lamiales|lamiaceae 20217
 
2.4%
plantae|rosales|rosaceae 19433
 
2.3%
plantae|asparagales|orchidaceae 16003
 
1.9%
plantae|malpighiales|euphorbiaceae 15183
 
1.8%
plantae|malvales|malvaceae 13567
 
1.6%
Other values (1176) 551185
65.8%

Most occurring characters

ValueCountFrequency (%)
a 5102548
20.4%
e 3832109
15.3%
l 2185070
8.8%
| 1696429
 
6.8%
n 1366109
 
5.5%
t 1256578
 
5.0%
s 1222815
 
4.9%
c 1172986
 
4.7%
P 1048270
 
4.2%
i 944495
 
3.8%
Other values (47) 5138011
20.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20676808
82.8%
Uppercase Letter 2556492
 
10.2%
Math Symbol 1696429
 
6.8%
Dash Punctuation 28115
 
0.1%
Other Punctuation 5659
 
< 0.1%
Space Separator 1917
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5102548
24.7%
e 3832109
18.5%
l 2185070
10.6%
n 1366109
 
6.6%
t 1256578
 
6.1%
s 1222815
 
5.9%
c 1172986
 
5.7%
i 944495
 
4.6%
r 705505
 
3.4%
o 683724
 
3.3%
Other values (16) 2204869
10.7%
Uppercase Letter
ValueCountFrequency (%)
P 1048270
41.0%
A 258499
 
10.1%
M 191774
 
7.5%
C 169318
 
6.6%
F 162045
 
6.3%
L 132794
 
5.2%
R 131130
 
5.1%
S 98924
 
3.9%
G 76556
 
3.0%
E 71800
 
2.8%
Other values (16) 215382
 
8.4%
Other Punctuation
ValueCountFrequency (%)
? 3255
57.5%
. 2404
42.5%
Math Symbol
ValueCountFrequency (%)
| 1696429
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 28115
100.0%
Space Separator
ValueCountFrequency (%)
1917
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23233300
93.1%
Common 1732120
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5102548
22.0%
e 3832109
16.5%
l 2185070
9.4%
n 1366109
 
5.9%
t 1256578
 
5.4%
s 1222815
 
5.3%
c 1172986
 
5.0%
P 1048270
 
4.5%
i 944495
 
4.1%
r 705505
 
3.0%
Other values (42) 4396815
18.9%
Common
ValueCountFrequency (%)
| 1696429
97.9%
- 28115
 
1.6%
? 3255
 
0.2%
. 2404
 
0.1%
1917
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24965420
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5102548
20.4%
e 3832109
15.3%
l 2185070
8.8%
| 1696429
 
6.8%
n 1366109
 
5.5%
t 1256578
 
5.0%
s 1222815
 
4.9%
c 1172986
 
4.7%
P 1048270
 
4.2%
i 944495
 
3.8%
Other values (47) 5138011
20.6%

Distinct: 7
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 6.4 MiB

Length

Max length: 14
Median length: 7
Mean length: 6.97992124
Min length: 5

Characters and Unicode

Total characters: 5836659
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st rowPlantae
2nd rowPlantae
3rd rowPlantae
4th rowPlantae
5th rowPlantae
ValueCountFrequency (%)
plantae 810590
96.9%
fungi 16418
 
2.0%
chromista 6571
 
0.8%
bacteria 2508
 
0.3%
protozoa 73
 
< 0.1%
incertae 46
 
< 0.1%
sedis 46
 
< 0.1%
animalia 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 1632888
28.0%
n 827055
14.2%
t 819788
14.0%
e 813236
13.9%
P 810663
13.9%
l 810591
13.9%
i 25591
 
0.4%
F 16418
 
0.3%
u 16418
 
0.3%
g 16418
 
0.3%
Other values (12) 47593
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5000452
85.7%
Uppercase Letter 836161
 
14.3%
Space Separator 46
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1632888
32.7%
n 827055
16.5%
t 819788
16.4%
e 813236
16.3%
l 810591
16.2%
i 25591
 
0.5%
u 16418
 
0.3%
g 16418
 
0.3%
r 9198
 
0.2%
o 6790
 
0.1%
Other values (6) 22479
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
P 810663
97.0%
F 16418
 
2.0%
C 6571
 
0.8%
B 2508
 
0.3%
A 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
46
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5836613
> 99.9%
Common 46
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1632888
28.0%
n 827055
14.2%
t 819788
14.0%
e 813236
13.9%
P 810663
13.9%
l 810591
13.9%
i 25591
 
0.4%
F 16418
 
0.3%
u 16418
 
0.3%
g 16418
 
0.3%
Other values (11) 47547
 
0.8%
Common
ValueCountFrequency (%)
46
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5836659
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1632888
28.0%
n 827055
14.2%
t 819788
14.0%
e 813236
13.9%
P 810663
13.9%
l 810591
13.9%
i 25591
 
0.4%
F 16418
 
0.3%
u 16418
 
0.3%
g 16418
 
0.3%
Other values (12) 47593
 
0.8%

phylum
Text

Distinct: 25
Distinct (%): < 0.1%
Missing: 4178
Missing (%): 0.5%
Memory size: 6.4 MiB

Length

Max length: 19
Median length: 12
Mean length: 11.90524151
Min length: 3

Characters and Unicode

Total characters: 9905530
Distinct characters: 32
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st rowTracheophyta
2nd rowTracheophyta
3rd rowTracheophyta
4th rowTracheophyta
5th rowTracheophyta
ValueCountFrequency (%)
tracheophyta 772865
92.9%
rhodophyta 12168
 
1.5%
bryophyta 10128
 
1.2%
basidiomycota 8611
 
1.0%
chlorophyta 7692
 
0.9%
ascomycota 7630
 
0.9%
ochrophyta 6511
 
0.8%
cyanobacteria 2493
 
0.3%
charophyta 2034
 
0.2%
marchantiophyta 1729
 
0.2%
Other values (15) 170
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 1623994
16.4%
h 1616162
16.3%
o 868326
8.8%
y 842138
8.5%
t 833784
8.4%
p 813147
8.2%
c 807626
8.2%
r 803498
8.1%
e 775465
7.8%
T 772865
7.8%
Other values (22) 148525
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9073497
91.6%
Uppercase Letter 832033
 
8.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1623994
17.9%
h 1616162
17.8%
o 868326
9.6%
y 842138
9.3%
t 833784
9.2%
p 813147
9.0%
c 807626
8.9%
r 803498
8.9%
e 775465
8.5%
i 21460
 
0.2%
Other values (9) 67897
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
T 772865
92.9%
B 18739
 
2.3%
C 12224
 
1.5%
R 12168
 
1.5%
A 7647
 
0.9%
O 6558
 
0.8%
M 1809
 
0.2%
E 10
 
< 0.1%
P 8
 
< 0.1%
H 2
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 9905530
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1623994
16.4%
h 1616162
16.3%
o 868326
8.8%
y 842138
8.5%
t 833784
8.4%
p 813147
8.2%
c 807626
8.2%
r 803498
8.1%
e 775465
7.8%
T 772865
7.8%
Other values (22) 148525
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9905530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1623994
16.4%
h 1616162
16.3%
o 868326
8.8%
y 842138
8.5%
t 833784
8.4%
p 813147
8.2%
c 807626
8.2%
r 803498
8.1%
e 775465
7.8%
T 772865
7.8%
Other values (22) 148525
 
1.5%

class
Text

Distinct: 76
Distinct (%): < 0.1%
Missing: 4394
Missing (%): 0.5%
Memory size: 6.4 MiB

Length

Max length: 21
Median length: 13
Mean length: 12.58699591
Min length: 7

Characters and Unicode

Total characters: 10470052
Distinct characters: 43
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st rowMagnoliopsida
2nd rowMagnoliopsida
3rd rowMagnoliopsida
4th rowPinopsida
5th rowMagnoliopsida
ValueCountFrequency (%)
magnoliopsida 602481
72.4%
liliopsida 124141
 
14.9%
polypodiopsida 37585
 
4.5%
florideophyceae 11412
 
1.4%
bryopsida 9385
 
1.1%
agaricomycetes 8120
 
1.0%
phaeophyceae 5400
 
0.6%
ulvophyceae 5312
 
0.6%
lecanoromycetes 4420
 
0.5%
lycopodiopsida 3965
 
0.5%
Other values (66) 19594
 
2.4%

Most occurring characters

ValueCountFrequency (%)
i 1713580
16.4%
o 1539272
14.7%
a 1448471
13.8%
p 855262
8.2%
d 839665
8.0%
s 801318
7.7%
l 785417
7.5%
n 621120
 
5.9%
g 613876
 
5.9%
M 602686
 
5.8%
Other values (33) 649385
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9638230
92.1%
Uppercase Letter 831822
 
7.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1713580
17.8%
o 1539272
16.0%
a 1448471
15.0%
p 855262
8.9%
d 839665
8.7%
s 801318
8.3%
l 785417
8.1%
n 621120
 
6.4%
g 613876
 
6.4%
e 118229
 
1.2%
Other values (13) 302020
 
3.1%
Uppercase Letter
ValueCountFrequency (%)
M 602686
72.5%
L 133228
 
16.0%
P 48442
 
5.8%
F 11412
 
1.4%
B 10605
 
1.3%
A 8316
 
1.0%
C 6387
 
0.8%
U 5327
 
0.6%
J 1598
 
0.2%
S 1174
 
0.1%
Other values (10) 2647
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 10470052
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1713580
16.4%
o 1539272
14.7%
a 1448471
13.8%
p 855262
8.2%
d 839665
8.0%
s 801318
7.7%
l 785417
7.5%
n 621120
 
5.9%
g 613876
 
5.9%
M 602686
 
5.8%
Other values (33) 649385
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10470052
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1713580
16.4%
o 1539272
14.7%
a 1448471
13.8%
p 855262
8.2%
d 839665
8.0%
s 801318
7.7%
l 785417
7.5%
n 621120
 
5.9%
g 613876
 
5.9%
M 602686
 
5.8%
Other values (33) 649385
 
6.2%

order
Text

Distinct: 379
Distinct (%): < 0.1%
Missing: 6630
Missing (%): 0.8%
Memory size: 6.4 MiB

Length

Max length: 23
Median length: 17
Mean length: 9.454319601
Min length: 6

Characters and Unicode

Total characters: 7843105
Distinct characters: 49
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39
Unique (%): < 0.1%

Sample

1st rowLamiales
2nd rowMalvales
3rd rowLamiales
4th rowPinales
5th rowLamiales
ValueCountFrequency (%)
poales 73718
 
8.9%
asterales 57572
 
6.9%
malpighiales 56399
 
6.8%
fabales 55446
 
6.7%
lamiales 55104
 
6.6%
gentianales 52371
 
6.3%
rosales 40401
 
4.9%
ericales 30937
 
3.7%
caryophyllales 30623
 
3.7%
polypodiales 28749
 
3.5%
Other values (369) 348259
42.0%
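
The percentages in the table above are consistent with shares of the non-missing values (for example, 73718 / (836209 - 6630) ≈ 8.9% for `poales`), and the values are listed in lowercase. A minimal sketch of a table built that way follows; the `value_frequencies` helper and its tiny input list are assumptions made for illustration, not the profiler's actual implementation.

```python
from collections import Counter


def value_frequencies(values, top=10):
    """Return (value, count, share) tuples for the most common values.

    Values are lowercased before counting; missing entries (None) are
    skipped, so shares are relative to the non-missing total.
    """
    present = [v.lower() for v in values if v is not None]
    counts = Counter(present)
    total = len(present)
    return [(v, n, n / total) for v, n in counts.most_common(top)]


# Hypothetical input, including one missing entry.
rows = value_frequencies(
    ["Lamiales", "Malvales", "Lamiales", None, "Pinales", "Lamiales"]
)
for value, n, share in rows:
    print(f"{value} {n} {share:.1%}")
```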

Most occurring characters

ValueCountFrequency (%)
a 1361922
17.4%
l 1101125
14.0%
s 1014961
12.9%
e 1006510
12.8%
i 484809
 
6.2%
o 286851
 
3.7%
r 285506
 
3.6%
n 249315
 
3.2%
p 211179
 
2.7%
t 191436
 
2.4%
Other values (39) 1649491
21.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7013526
89.4%
Uppercase Letter 829579
 
10.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1361922
19.4%
l 1101125
15.7%
s 1014961
14.5%
e 1006510
14.4%
i 484809
 
6.9%
o 286851
 
4.1%
r 285506
 
4.1%
n 249315
 
3.6%
p 211179
 
3.0%
t 191436
 
2.7%
Other values (15) 819912
11.7%
Uppercase Letter
ValueCountFrequency (%)
M 120857
14.6%
P 120381
14.5%
A 115430
13.9%
L 73180
8.8%
F 64704
7.8%
G 61065
7.4%
C 61049
7.4%
S 55307
6.7%
R 55091
6.6%
E 34231
 
4.1%
Other values (14) 68284
8.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 7843105
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1361922
17.4%
l 1101125
14.0%
s 1014961
12.9%
e 1006510
12.8%
i 484809
 
6.2%
o 286851
 
3.7%
r 285506
 
3.6%
n 249315
 
3.2%
p 211179
 
2.7%
t 191436
 
2.4%
Other values (39) 1649491
21.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7843105
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1361922
17.4%
l 1101125
14.0%
s 1014961
12.9%
e 1006510
12.8%
i 484809
 
6.2%
o 286851
 
3.7%
r 285506
 
3.6%
n 249315
 
3.2%
p 211179
 
2.7%
t 191436
 
2.4%
Other values (39) 1649491
21.0%

family
Text

Distinct 1416
Distinct (%) 0.2%
Missing 6695
Missing (%) 0.8%
Memory size 6.4 MiB

Length

Max length 36
Median length 21
Mean length 10.77436909
Min length 7

Characters and Unicode

Total characters 8937490
Distinct characters 62
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 176
Unique (%) < 0.1%

Sample

1st row Plantaginaceae
2nd row Dipterocarpaceae
3rd row Plantaginaceae
4th row Araucariaceae
5th row Plantaginaceae
ValueCountFrequency (%)
fabaceae 52179
 
6.3%
asteraceae 51696
 
6.2%
poaceae 43659
 
5.3%
rubiaceae 32694
 
3.9%
cyperaceae 22275
 
2.7%
lamiaceae 20240
 
2.4%
rosaceae 19433
 
2.3%
orchidaceae 15993
 
1.9%
euphorbiaceae 15181
 
1.8%
malvaceae 13704
 
1.7%
Other values (1406) 542460
65.4%

Most occurring characters

ValueCountFrequency (%)
a 2080969
23.3%
e 1910674
21.4%
c 993402
11.1%
i 393681
 
4.4%
r 388909
 
4.4%
o 332488
 
3.7%
n 276346
 
3.1%
l 274366
 
3.1%
t 234156
 
2.6%
s 176824
 
2.0%
Other values (52) 1875675
21.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8107939
90.7%
Uppercase Letter 829518
 
9.3%
Decimal Number 24
 
< 0.1%
Connector Punctuation 5
 
< 0.1%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2080969
25.7%
e 1910674
23.6%
c 993402
12.3%
i 393681
 
4.9%
r 388909
 
4.8%
o 332488
 
4.1%
n 276346
 
3.4%
l 274366
 
3.4%
t 234156
 
2.9%
s 176824
 
2.2%
Other values (16) 1046124
12.9%
Uppercase Letter
ValueCountFrequency (%)
A 133823
16.1%
P 118188
14.2%
C 94383
11.4%
R 75706
9.1%
M 62478
7.5%
F 58063
7.0%
L 46714
 
5.6%
S 42631
 
5.1%
E 35407
 
4.3%
B 33606
 
4.1%
Other values (15) 128519
15.5%
Decimal Number
ValueCountFrequency (%)
1 6
25.0%
4 5
20.8%
2 4
16.7%
5 2
 
8.3%
8 2
 
8.3%
6 2
 
8.3%
9 1
 
4.2%
7 1
 
4.2%
0 1
 
4.2%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8937457
> 99.9%
Common 33
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2080969
23.3%
e 1910674
21.4%
c 993402
11.1%
i 393681
 
4.4%
r 388909
 
4.4%
o 332488
 
3.7%
n 276346
 
3.1%
l 274366
 
3.1%
t 234156
 
2.6%
s 176824
 
2.0%
Other values (41) 1875642
21.0%
Common
ValueCountFrequency (%)
1 6
18.2%
_ 5
15.2%
4 5
15.2%
- 4
12.1%
2 4
12.1%
5 2
 
6.1%
8 2
 
6.1%
6 2
 
6.1%
9 1
 
3.0%
7 1
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8937490
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2080969
23.3%
e 1910674
21.4%
c 993402
11.1%
i 393681
 
4.4%
r 388909
 
4.4%
o 332488
 
3.7%
n 276346
 
3.1%
l 274366
 
3.1%
t 234156
 
2.6%
s 176824
 
2.0%
Other values (52) 1875675
21.0%

subfamily
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 2
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row NL
ValueCountFrequency (%)
nl 1
100.0%

Most occurring characters

ValueCountFrequency (%)
N 1
50.0%
L 1
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 1
50.0%
L 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1
50.0%
L 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1
50.0%
L 1
50.0%

tribe
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 24
Median length 24
Mean length 24
Min length 24

Characters and Unicode

Total characters 24
Distinct characters 13
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 2024-11-01T10:28:05.946Z
ValueCountFrequency (%)
2024-11-01t10:28:05.946z 1
100.0%

Most occurring characters

ValueCountFrequency (%)
0 4
16.7%
1 4
16.7%
2 3
12.5%
4 2
8.3%
- 2
8.3%
: 2
8.3%
T 1
 
4.2%
8 1
 
4.2%
5 1
 
4.2%
. 1
 
4.2%
Other values (3) 3
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 17
70.8%
Other Punctuation 3
 
12.5%
Dash Punctuation 2
 
8.3%
Uppercase Letter 2
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4
23.5%
1 4
23.5%
2 3
17.6%
4 2
11.8%
8 1
 
5.9%
5 1
 
5.9%
9 1
 
5.9%
6 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 2
66.7%
. 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
Z 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22
91.7%
Latin 2
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4
18.2%
1 4
18.2%
2 3
13.6%
4 2
9.1%
- 2
9.1%
: 2
9.1%
8 1
 
4.5%
5 1
 
4.5%
. 1
 
4.5%
9 1
 
4.5%
Latin
ValueCountFrequency (%)
T 1
50.0%
Z 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4
16.7%
1 4
16.7%
2 3
12.5%
4 2
8.3%
- 2
8.3%
: 2
8.3%
T 1
 
4.2%
8 1
 
4.2%
5 1
 
4.2%
. 1
 
4.2%
Other values (3) 3
12.5%

genus
Text

Missing 

Distinct 13976
Distinct (%) 1.7%
Missing 13165
Missing (%) 1.6%
Memory size 6.4 MiB

Length

Max length 23
Median length 20
Mean length 8.628420838
Min length 2

Characters and Unicode

Total characters 7101570
Distinct characters 53
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2459
Unique (%) 0.3%

Sample

1st row Plantago
2nd row Shorea
3rd row Plantago
4th row Agathis
5th row Plantago
ValueCountFrequency (%)
carex 9711
 
1.2%
ficus 7339
 
0.9%
rubus 6530
 
0.8%
taraxacum 5291
 
0.6%
cyperus 4059
 
0.5%
salix 3696
 
0.4%
ranunculus 3488
 
0.4%
galium 3355
 
0.4%
euphorbia 3348
 
0.4%
asplenium 3302
 
0.4%
Other values (13965) 772925
93.9%

Most occurring characters

ValueCountFrequency (%)
a 882159
 
12.4%
i 637415
 
9.0%
e 489322
 
6.9%
r 472496
 
6.7%
o 471851
 
6.6%
u 395415
 
5.6%
s 394685
 
5.6%
l 381894
 
5.4%
n 361563
 
5.1%
t 291795
 
4.1%
Other values (43) 2322975
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6278303
88.4%
Uppercase Letter 823109
 
11.6%
Dash Punctuation 158
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 882159
14.1%
i 637415
10.2%
e 489322
 
7.8%
r 472496
 
7.5%
o 471851
 
7.5%
u 395415
 
6.3%
s 394685
 
6.3%
l 381894
 
6.1%
n 361563
 
5.8%
t 291795
 
4.6%
Other values (16) 1499708
23.9%
Uppercase Letter
ValueCountFrequency (%)
C 112385
13.7%
P 87084
10.6%
S 79350
 
9.6%
A 78110
 
9.5%
M 48629
 
5.9%
D 43750
 
5.3%
L 43489
 
5.3%
T 40242
 
4.9%
E 39409
 
4.8%
G 37186
 
4.5%
Other values (16) 213475
25.9%
Dash Punctuation
ValueCountFrequency (%)
- 158
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7101412
> 99.9%
Common 158
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 882159
 
12.4%
i 637415
 
9.0%
e 489322
 
6.9%
r 472496
 
6.7%
o 471851
 
6.6%
u 395415
 
5.6%
s 394685
 
5.6%
l 381894
 
5.4%
n 361563
 
5.1%
t 291795
 
4.1%
Other values (42) 2322817
32.7%
Common
ValueCountFrequency (%)
- 158
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7101570
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 882159
 
12.4%
i 637415
 
9.0%
e 489322
 
6.9%
r 472496
 
6.7%
o 471851
 
6.6%
u 395415
 
5.6%
s 394685
 
5.6%
l 381894
 
5.4%
n 361563
 
5.1%
t 291795
 
4.1%
Other values (43) 2322975
32.7%

genericName
Text

Missing 

Distinct 14992
Distinct (%) 1.8%
Missing 13241
Missing (%) 1.6%
Memory size 6.4 MiB

Length

Max length 23
Median length 19
Mean length 8.528952523
Min length 3

Characters and Unicode

Total characters 7019055
Distinct characters 55
Distinct categories 3
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3301
Unique (%) 0.4%

Sample

1st row Plantago
2nd row Shorea
3rd row Plantago
4th row Agathis
5th row Plantago
ValueCountFrequency (%)
carex 9604
 
1.2%
ficus 7336
 
0.9%
rubus 6531
 
0.8%
taraxacum 5292
 
0.6%
hieracium 4623
 
0.6%
salix 3662
 
0.4%
ranunculus 3636
 
0.4%
cyperus 3522
 
0.4%
galium 3425
 
0.4%
juncus 3251
 
0.4%
Other values (14981) 772086
93.8%

Most occurring characters

ValueCountFrequency (%)
a 868984
 
12.4%
i 638023
 
9.1%
e 479416
 
6.8%
r 468695
 
6.7%
o 461003
 
6.6%
u 398937
 
5.7%
s 388992
 
5.5%
l 364131
 
5.2%
n 358302
 
5.1%
t 287234
 
4.1%
Other values (45) 2305338
32.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6196012
88.3%
Uppercase Letter 822985
 
11.7%
Dash Punctuation 58
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 868984
14.0%
i 638023
10.3%
e 479416
 
7.7%
r 468695
 
7.6%
o 461003
 
7.4%
u 398937
 
6.4%
s 388992
 
6.3%
l 364131
 
5.9%
n 358302
 
5.8%
t 287234
 
4.6%
Other values (18) 1482295
23.9%
Uppercase Letter
ValueCountFrequency (%)
C 114403
13.9%
P 84684
 
10.3%
A 79730
 
9.7%
S 79497
 
9.7%
M 48353
 
5.9%
D 44247
 
5.4%
L 43302
 
5.3%
E 40630
 
4.9%
T 40116
 
4.9%
H 35828
 
4.4%
Other values (16) 212195
25.8%
Dash Punctuation
ValueCountFrequency (%)
- 58
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7018997
> 99.9%
Common 58
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 868984
 
12.4%
i 638023
 
9.1%
e 479416
 
6.8%
r 468695
 
6.7%
o 461003
 
6.6%
u 398937
 
5.7%
s 388992
 
5.5%
l 364131
 
5.2%
n 358302
 
5.1%
t 287234
 
4.1%
Other values (44) 2305280
32.8%
Common
ValueCountFrequency (%)
- 58
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7019043
> 99.9%
None 12
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 868984
 
12.4%
i 638023
 
9.1%
e 479416
 
6.8%
r 468695
 
6.7%
o 461003
 
6.6%
u 398937
 
5.7%
s 388992
 
5.5%
l 364131
 
5.2%
n 358302
 
5.1%
t 287234
 
4.1%
Other values (43) 2305326
32.8%
None
ValueCountFrequency (%)
ë 11
91.7%
ö 1
 
8.3%
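
The two characters listed under "None" are not ASCII, which suggests the profiler did not resolve a block name for them. In the Unicode Standard both code points fall in the Latin-1 Supplement block (U+0080 to U+00FF). The sketch below checks this with a small hand-written range table; the `BLOCKS` list and the `block_of` helper are hypothetical and cover only the two blocks relevant here, since Python's standard `unicodedata` module does not expose block names.

```python
import unicodedata

# Block ranges from the Unicode Standard; only the two relevant here are listed.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin (ASCII)"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
]


def block_of(ch):
    """Return the block name for a character, or 'unknown' if not listed."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "unknown"


# "\u00eb" is e with diaeresis, "\u00f6" is o with diaeresis.
for ch in "\u00eb\u00f6":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {block_of(ch)}")
```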

specificEpithet
Text

Missing 

Distinct 40036
Distinct (%) 5.3%
Missing 78237
Missing (%) 9.4%
Memory size 6.4 MiB

Length

Max length 25
Median length 20
Mean length 8.998581742
Min length 2

Characters and Unicode

Total characters 6820673
Distinct characters 30
Distinct categories 2
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 13804
Unique (%) 1.8%

Sample

1st row platycarpa
2nd row borneensis
3rd row hopeifolia
4th row hexandrum
5th row ovata
ValueCountFrequency (%)
vulgaris 4161
 
0.5%
palustris 3085
 
0.4%
arvensis 2985
 
0.4%
officinalis 2666
 
0.4%
indica 2512
 
0.3%
repens 2282
 
0.3%
maritima 2041
 
0.3%
alpina 1923
 
0.3%
vulgare 1822
 
0.2%
javanica 1815
 
0.2%
Other values (40026) 732680
96.7%

Most occurring characters

ValueCountFrequency (%)
a 934265
13.7%
i 768666
11.3%
s 511342
 
7.5%
e 476245
 
7.0%
r 451502
 
6.6%
l 444825
 
6.5%
n 423483
 
6.2%
u 419441
 
6.1%
o 390489
 
5.7%
t 360947
 
5.3%
Other values (20) 1639468
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6815901
99.9%
Dash Punctuation 4772
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 934265
13.7%
i 768666
11.3%
s 511342
 
7.5%
e 476245
 
7.0%
r 451502
 
6.6%
l 444825
 
6.5%
n 423483
 
6.2%
u 419441
 
6.2%
o 390489
 
5.7%
t 360947
 
5.3%
Other values (19) 1634696
24.0%
Dash Punctuation
ValueCountFrequency (%)
- 4772
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6815901
99.9%
Common 4772
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 934265
13.7%
i 768666
11.3%
s 511342
 
7.5%
e 476245
 
7.0%
r 451502
 
6.6%
l 444825
 
6.5%
n 423483
 
6.2%
u 419441
 
6.2%
o 390489
 
5.7%
t 360947
 
5.3%
Other values (19) 1634696
24.0%
Common
ValueCountFrequency (%)
- 4772
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6820639
> 99.9%
None 34
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 934265
13.7%
i 768666
11.3%
s 511342
 
7.5%
e 476245
 
7.0%
r 451502
 
6.6%
l 444825
 
6.5%
n 423483
 
6.2%
u 419441
 
6.1%
o 390489
 
5.7%
t 360947
 
5.3%
Other values (17) 1639434
24.0%
None
ValueCountFrequency (%)
ï 30
88.2%
ë 3
 
8.8%
ü 1
 
2.9%

infraspecificEpithet
Text

Missing 

Distinct 9364
Distinct (%) 16.3%
Missing 778925
Missing (%) 93.1%
Memory size 6.4 MiB

Length

Max length 23
Median length 20
Mean length 9.13082187
Min length 3

Characters and Unicode

Total characters 523050
Distinct characters 27
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4061
Unique (%) 7.1%

Sample

1st row velutinata
2nd row mollis
3rd row sycomoroides
4th row globifera
5th row formosum
ValueCountFrequency (%)
angustifolia 326
 
0.6%
pubescens 301
 
0.5%
album 284
 
0.5%
vulgaris 276
 
0.5%
glabra 251
 
0.4%
major 240
 
0.4%
vulgare 236
 
0.4%
montana 210
 
0.4%
montanum 192
 
0.3%
repens 189
 
0.3%
Other values (9354) 54779
95.6%

Most occurring characters

ValueCountFrequency (%)
a 70401
13.5%
i 58390
11.2%
s 39103
 
7.5%
e 37024
 
7.1%
l 35883
 
6.9%
r 34056
 
6.5%
u 33329
 
6.4%
n 31689
 
6.1%
o 30485
 
5.8%
t 28007
 
5.4%
Other values (17) 124683
23.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 522937
> 99.9%
Dash Punctuation 113
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 70401
13.5%
i 58390
11.2%
s 39103
 
7.5%
e 37024
 
7.1%
l 35883
 
6.9%
r 34056
 
6.5%
u 33329
 
6.4%
n 31689
 
6.1%
o 30485
 
5.8%
t 28007
 
5.4%
Other values (16) 124570
23.8%
Dash Punctuation
ValueCountFrequency (%)
- 113
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 522937
> 99.9%
Common 113
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 70401
13.5%
i 58390
11.2%
s 39103
 
7.5%
e 37024
 
7.1%
l 35883
 
6.9%
r 34056
 
6.5%
u 33329
 
6.4%
n 31689
 
6.1%
o 30485
 
5.8%
t 28007
 
5.4%
Other values (16) 124570
23.8%
Common
ValueCountFrequency (%)
- 113
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 523050
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 70401
13.5%
i 58390
11.2%
s 39103
 
7.5%
e 37024
 
7.1%
l 35883
 
6.9%
r 34056
 
6.5%
u 33329
 
6.4%
n 31689
 
6.1%
o 30485
 
5.8%
t 28007
 
5.4%
Other values (17) 124683
23.8%

cultivarEpithet
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 4
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row true
ValueCountFrequency (%)
true 1
100.0%

Most occurring characters

ValueCountFrequency (%)
t 1
25.0%
r 1
25.0%
u 1
25.0%
e 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1
25.0%
r 1
25.0%
u 1
25.0%
e 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 1
25.0%
r 1
25.0%
u 1
25.0%
e 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 1
25.0%
r 1
25.0%
u 1
25.0%
e 1
25.0%
Distinct 11
Distinct (%) < 0.1%
Missing 1
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 10
Median length 7
Mean length 6.906764824
Min length 4

Characters and Unicode

Total characters 5775492
Distinct characters 26
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row GENUS
2nd row SPECIES
3rd row GENUS
4th row SPECIES
5th row GENUS
ValueCountFrequency (%)
species 700762
83.8%
genus 64996
 
7.8%
variety 30937
 
3.7%
subspecies 23447
 
2.8%
family 8933
 
1.1%
kingdom 3825
 
0.5%
form 2902
 
0.3%
class 244
 
< 0.1%
phylum 138
 
< 0.1%
order 23
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 1544374
26.7%
S 1537349
26.6%
I 767904
13.3%
C 724453
12.5%
P 724347
12.5%
U 88581
 
1.5%
G 68821
 
1.2%
N 68821
 
1.2%
A 40114
 
0.7%
Y 40008
 
0.7%
Other values (16) 170720
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5775487
> 99.9%
Lowercase Letter 5
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1544374
26.7%
S 1537349
26.6%
I 767904
13.3%
C 724453
12.5%
P 724347
12.5%
U 88581
 
1.5%
G 68821
 
1.2%
N 68821
 
1.2%
A 40114
 
0.7%
Y 40008
 
0.7%
Other values (11) 170715
 
3.0%
Lowercase Letter
ValueCountFrequency (%)
f 1
20.0%
a 1
20.0%
l 1
20.0%
s 1
20.0%
e 1
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5775492
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1544374
26.7%
S 1537349
26.6%
I 767904
13.3%
C 724453
12.5%
P 724347
12.5%
U 88581
 
1.5%
G 68821
 
1.2%
N 68821
 
1.2%
A 40114
 
0.7%
Y 40008
 
0.7%
Other values (16) 170720
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5775492
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1544374
26.7%
S 1537349
26.6%
I 767904
13.3%
C 724453
12.5%
P 724347
12.5%
U 88581
 
1.5%
G 68821
 
1.2%
N 68821
 
1.2%
A 40114
 
0.7%
Y 40008
 
0.7%
Other values (16) 170720
 
3.0%

verbatimTaxonRank
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 2608920
ValueCountFrequency (%)
2608920 1
100.0%

Most occurring characters

ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

vernacularName
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 2608920
ValueCountFrequency (%)
2608920 1
100.0%

Most occurring characters

ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%
Distinct 2
Distinct (%) < 0.1%
Missing 1
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 3
Median length 3
Mean length 2.999997608
Min length 1

Characters and Unicode

Total characters 2508622
Distinct characters 4
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row ICN
2nd row ICN
3rd row ICN
4th row ICN
5th row ICN
ValueCountFrequency (%)
icn 836207
> 99.9%
5 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
I 836207
33.3%
C 836207
33.3%
N 836207
33.3%
5 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2508621
> 99.9%
Decimal Number 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 836207
33.3%
C 836207
33.3%
N 836207
33.3%
Decimal Number
ValueCountFrequency (%)
5 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2508621
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 836207
33.3%
C 836207
33.3%
N 836207
33.3%
Common
ValueCountFrequency (%)
5 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2508622
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 836207
33.3%
C 836207
33.3%
N 836207
33.3%
5 1
 
< 0.1%
Distinct 4
Distinct (%) < 0.1%
Missing 47
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 8
Median length 8
Mean length 7.766258213
Min length 2

Characters and Unicode

Total characters 6493850
Distinct characters 17
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row ACCEPTED
2nd row ACCEPTED
3rd row ACCEPTED
4th row ACCEPTED
5th row ACCEPTED
ValueCountFrequency (%)
accepted 630751
75.4%
synonym 195440
 
23.4%
doubtful 9970
 
1.2%
95 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 1261502
19.4%
C 1261502
19.4%
T 640721
9.9%
D 640721
9.9%
A 630751
9.7%
P 630751
9.7%
Y 390880
 
6.0%
N 390880
 
6.0%
O 205410
 
3.2%
S 195440
 
3.0%
Other values (7) 245292
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6493848
> 99.9%
Decimal Number 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1261502
19.4%
C 1261502
19.4%
T 640721
9.9%
D 640721
9.9%
A 630751
9.7%
P 630751
9.7%
Y 390880
 
6.0%
N 390880
 
6.0%
O 205410
 
3.2%
S 195440
 
3.0%
Other values (5) 245290
 
3.8%
Decimal Number
ValueCountFrequency (%)
9 1
50.0%
5 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6493848
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1261502
19.4%
C 1261502
19.4%
T 640721
9.9%
D 640721
9.9%
A 630751
9.7%
P 630751
9.7%
Y 390880
 
6.0%
N 390880
 
6.0%
O 205410
 
3.2%
S 195440
 
3.0%
Other values (5) 245290
 
3.8%
Common
ValueCountFrequency (%)
9 1
50.0%
5 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6493850
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1261502
19.4%
C 1261502
19.4%
T 640721
9.9%
D 640721
9.9%
A 630751
9.7%
P 630751
9.7%
Y 390880
 
6.0%
N 390880
 
6.0%
O 205410
 
3.2%
S 195440
 
3.0%
Other values (7) 245292
 
3.8%

nomenclaturalStatus
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 3
Distinct characters 3
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 180
ValueCountFrequency (%)
180 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 1
33.3%
8 1
33.3%
0 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1
33.3%
8 1
33.3%
0 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1
33.3%
8 1
33.3%
0 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1
33.3%
8 1
33.3%
0 1
33.3%

taxonRemarks
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 10861608
ValueCountFrequency (%)
10861608 1
100.0%

Most occurring characters

ValueCountFrequency (%)
1 2
25.0%
0 2
25.0%
8 2
25.0%
6 2
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
25.0%
0 2
25.0%
8 2
25.0%
6 2
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
25.0%
0 2
25.0%
8 2
25.0%
6 2
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
25.0%
0 2
25.0%
8 2
25.0%
6 2
25.0%
Distinct 2
Distinct (%) < 0.1%
Missing 1
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 36
Median length 36
Mean length 35.99996173
Min length 4

Characters and Unicode

Total characters 30103456
Distinct characters 15
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row 15f819bd-6612-4447-854b-14d12ee1022d
2nd row 15f819bd-6612-4447-854b-14d12ee1022d
3rd row 15f819bd-6612-4447-854b-14d12ee1022d
4th row 15f819bd-6612-4447-854b-14d12ee1022d
5th row 15f819bd-6612-4447-854b-14d12ee1022d
ValueCountFrequency (%)
15f819bd-6612-4447-854b-14d12ee1022d 836207
> 99.9%
8369 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1 5017242
16.7%
4 4181035
13.9%
- 3344828
11.1%
2 3344828
11.1%
d 2508621
8.3%
8 1672415
 
5.6%
6 1672415
 
5.6%
5 1672414
 
5.6%
b 1672414
 
5.6%
e 1672414
 
5.6%
Other values (5) 3344830
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 20068972
66.7%
Lowercase Letter 6689656
 
22.2%
Dash Punctuation 3344828
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 5017242
25.0%
4 4181035
20.8%
2 3344828
16.7%
8 1672415
 
8.3%
6 1672415
 
8.3%
5 1672414
 
8.3%
9 836208
 
4.2%
7 836207
 
4.2%
0 836207
 
4.2%
3 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
d 2508621
37.5%
b 1672414
25.0%
e 1672414
25.0%
f 836207
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 3344828
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23413800
77.8%
Latin 6689656
 
22.2%

Most frequent character per script

Common
ValueCountFrequency (%)
1 5017242
21.4%
4 4181035
17.9%
- 3344828
14.3%
2 3344828
14.3%
8 1672415
 
7.1%
6 1672415
 
7.1%
5 1672414
 
7.1%
9 836208
 
3.6%
7 836207
 
3.6%
0 836207
 
3.6%
Latin
ValueCountFrequency (%)
d 2508621
37.5%
b 1672414
25.0%
e 1672414
25.0%
f 836207
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30103456
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 5017242
16.7%
4 4181035
13.9%
- 3344828
11.1%
2 3344828
11.1%
d 2508621
8.3%
8 1672415
 
5.6%
6 1672415
 
5.6%
5 1672414
 
5.6%
b 1672414
 
5.6%
e 1672414
 
5.6%
Other values (5) 3344830
11.1%
Distinct 2
Distinct (%) < 0.1%
Missing 1
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 7
Median length 2
Mean length 2.000005979
Min length 2

Characters and Unicode

Total characters 1672421
Distinct characters 7
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row NL
2nd row NL
3rd row NL
4th row NL
5th row NL
ValueCountFrequency (%)
nl 836207
> 99.9%
2600367 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N 836207
50.0%
L 836207
50.0%
6 2
 
< 0.1%
0 2
 
< 0.1%
2 1
 
< 0.1%
3 1
 
< 0.1%
7 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1672414
> 99.9%
Decimal Number 7
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 2
28.6%
0 2
28.6%
2 1
14.3%
3 1
14.3%
7 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
N 836207
50.0%
L 836207
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1672414
> 99.9%
Common 7
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
6 2
28.6%
0 2
28.6%
2 1
14.3%
3 1
14.3%
7 1
14.3%
Latin
ValueCountFrequency (%)
N 836207
50.0%
L 836207
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1672421
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 836207
50.0%
L 836207
50.0%
6 2
 
< 0.1%
0 2
 
< 0.1%
2 1
 
< 0.1%
3 1
 
< 0.1%
7 1
 
< 0.1%
Distinct 151288
Distinct (%) 18.1%
Missing 2
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 24
Median length 24
Mean length 23.99604404
Min length 20

Characters and Unicode

Total characters 20065660
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 19524
Unique (%) 2.3%

Sample

1st row 2024-11-01T10:27:16.300Z
2nd row 2024-11-01T10:29:04.857Z
3rd row 2024-11-01T10:27:16.301Z
4th row 2024-11-01T10:29:41.603Z
5th row 2024-11-01T10:27:17.382Z
ValueCountFrequency (%)
2024-11-01t10:27:17.419z 32
 
< 0.1%
2024-11-01t10:26:47.509z 30
 
< 0.1%
2024-11-01t10:27:17.556z 29
 
< 0.1%
2024-11-01t10:27:17.502z 28
 
< 0.1%
2024-11-01t10:28:04.529z 28
 
< 0.1%
2024-11-01t10:27:28.429z 28
 
< 0.1%
2024-11-01t10:27:27.691z 28
 
< 0.1%
2024-11-01t10:27:17.495z 28
 
< 0.1%
2024-11-01t10:27:16.167z 28
 
< 0.1%
2024-11-01t10:27:02.841z 28
 
< 0.1%
Other values (151278) 835920
> 99.9%

Most occurring characters

ValueCountFrequency (%)
1 3864670
19.3%
0 3022309
15.1%
2 2963708
14.8%
- 1672414
8.3%
: 1672414
8.3%
4 1297614
 
6.5%
T 836207
 
4.2%
Z 836207
 
4.2%
. 835380
 
4.2%
7 673118
 
3.4%
Other values (5) 2391619
11.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14213038
70.8%
Other Punctuation 2507794
 
12.5%
Dash Punctuation 1672414
 
8.3%
Uppercase Letter 1672414
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3864670
27.2%
0 3022309
21.3%
2 2963708
20.9%
4 1297614
 
9.1%
7 673118
 
4.7%
8 599914
 
4.2%
9 470980
 
3.3%
5 457024
 
3.2%
3 449982
 
3.2%
6 413719
 
2.9%
Other Punctuation
ValueCountFrequency (%)
: 1672414
66.7%
. 835380
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 836207
50.0%
Z 836207
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1672414
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 18393246
91.7%
Latin 1672414
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3864670
21.0%
0 3022309
16.4%
2 2963708
16.1%
- 1672414
9.1%
: 1672414
9.1%
4 1297614
 
7.1%
. 835380
 
4.5%
7 673118
 
3.7%
8 599914
 
3.3%
9 470980
 
2.6%
Other values (3) 1320725
 
7.2%
Latin
ValueCountFrequency (%)
T 836207
50.0%
Z 836207
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20065660
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3864670
19.3%
0 3022309
15.1%
2 2963708
14.8%
- 1672414
8.3%
: 1672414
8.3%
4 1297614
 
6.5%
T 836207
 
4.2%
Z 836207
 
4.2%
. 835380
 
4.2%
7 673118
 
3.4%
Other values (5) 2391619
11.9%

elevation
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 2608920
ValueCountFrequency (%)
2608920 1
100.0%

Most occurring characters

ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2
28.6%
0 2
28.6%
6 1
14.3%
8 1
14.3%
9 1
14.3%

elevationAccuracy
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 14
Median length 14
Mean length 14
Min length 14

Characters and Unicode

Total characters 14
Distinct characters 9
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Physcia caesia
ValueCountFrequency (%)
physcia 1
50.0%
caesia 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 3
21.4%
s 2
14.3%
c 2
14.3%
i 2
14.3%
P 1
 
7.1%
h 1
 
7.1%
y 1
 
7.1%
1
 
7.1%
e 1
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
85.7%
Uppercase Letter 1
 
7.1%
Space Separator 1
 
7.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
25.0%
s 2
16.7%
c 2
16.7%
i 2
16.7%
h 1
 
8.3%
y 1
 
8.3%
e 1
 
8.3%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13
92.9%
Common 1
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
23.1%
s 2
15.4%
c 2
15.4%
i 2
15.4%
P 1
 
7.7%
h 1
 
7.7%
y 1
 
7.7%
e 1
 
7.7%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
21.4%
s 2
14.3%
c 2
14.3%
i 2
14.3%
P 1
 
7.1%
h 1
 
7.1%
y 1
 
7.1%
1
 
7.1%
e 1
 
7.1%

depth
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 30
Median length 30
Mean length 30
Min length 30

Characters and Unicode

Total characters 30
Distinct characters 20
Distinct categories 6
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Physcia caesia (Hoffm.) Fürnr.
ValueCountFrequency (%)
physcia 1
25.0%
caesia 1
25.0%
hoffm 1
25.0%
fürnr 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 3
 
10.0%
3
 
10.0%
r 2
 
6.7%
s 2
 
6.7%
c 2
 
6.7%
i 2
 
6.7%
. 2
 
6.7%
f 2
 
6.7%
P 1
 
3.3%
m 1
 
3.3%
Other values (10) 10
33.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20
66.7%
Space Separator 3
 
10.0%
Uppercase Letter 3
 
10.0%
Other Punctuation 2
 
6.7%
Close Punctuation 1
 
3.3%
Open Punctuation 1
 
3.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
15.0%
r 2
10.0%
s 2
10.0%
c 2
10.0%
i 2
10.0%
f 2
10.0%
m 1
 
5.0%
ü 1
 
5.0%
o 1
 
5.0%
h 1
 
5.0%
Other values (3) 3
15.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
F 1
33.3%
H 1
33.3%
Space Separator
ValueCountFrequency (%)
3
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23
76.7%
Common 7
 
23.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
13.0%
r 2
 
8.7%
s 2
 
8.7%
c 2
 
8.7%
i 2
 
8.7%
f 2
 
8.7%
P 1
 
4.3%
m 1
 
4.3%
ü 1
 
4.3%
F 1
 
4.3%
Other values (6) 6
26.1%
Common
ValueCountFrequency (%)
3
42.9%
. 2
28.6%
) 1
 
14.3%
( 1
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29
96.7%
None 1
 
3.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
 
10.3%
3
 
10.3%
r 2
 
6.9%
s 2
 
6.9%
c 2
 
6.9%
i 2
 
6.9%
. 2
 
6.9%
f 2
 
6.9%
P 1
 
3.4%
m 1
 
3.4%
Other values (9) 9
31.0%
None
ValueCountFrequency (%)
ü 1
100.0%

depthAccuracy
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 39
Median length 39
Mean length 39
Min length 39

Characters and Unicode

Total characters 39
Distinct characters 22
Distinct categories 6
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Physcia caesia (Hoffm.) Hampe ex Fürnr.
ValueCountFrequency (%)
physcia 1
16.7%
caesia 1
16.7%
hoffm 1
16.7%
hampe 1
16.7%
ex 1
16.7%
fürnr 1
16.7%

Most occurring characters

ValueCountFrequency (%)
5
 
12.8%
a 4
 
10.3%
e 3
 
7.7%
r 2
 
5.1%
s 2
 
5.1%
c 2
 
5.1%
i 2
 
5.1%
H 2
 
5.1%
f 2
 
5.1%
m 2
 
5.1%
Other values (12) 13
33.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26
66.7%
Space Separator 5
 
12.8%
Uppercase Letter 4
 
10.3%
Other Punctuation 2
 
5.1%
Close Punctuation 1
 
2.6%
Open Punctuation 1
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
15.4%
e 3
11.5%
r 2
 
7.7%
s 2
 
7.7%
c 2
 
7.7%
i 2
 
7.7%
f 2
 
7.7%
m 2
 
7.7%
p 1
 
3.8%
ü 1
 
3.8%
Other values (5) 5
19.2%
Uppercase Letter
ValueCountFrequency (%)
H 2
50.0%
P 1
25.0%
F 1
25.0%
Space Separator
ValueCountFrequency (%)
5
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30
76.9%
Common 9
 
23.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
13.3%
e 3
 
10.0%
r 2
 
6.7%
s 2
 
6.7%
c 2
 
6.7%
i 2
 
6.7%
H 2
 
6.7%
f 2
 
6.7%
m 2
 
6.7%
P 1
 
3.3%
Other values (8) 8
26.7%
Common
ValueCountFrequency (%)
5
55.6%
. 2
 
22.2%
) 1
 
11.1%
( 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38
97.4%
None 1
 
2.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5
13.2%
a 4
 
10.5%
e 3
 
7.9%
r 2
 
5.3%
s 2
 
5.3%
c 2
 
5.3%
i 2
 
5.3%
H 2
 
5.3%
f 2
 
5.3%
m 2
 
5.3%
Other values (11) 12
31.6%
None
ValueCountFrequency (%)
ü 1
100.0%
Distinct 360
Distinct (%) 11.7%
Missing 833143
Missing (%) 99.6%
Memory size 6.4 MiB

Length

Max length 19
Median length 18
Mean length 12.61448141
Min length 3

Characters and Unicode

Total characters 38676
Distinct characters 11
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 155
Unique (%) 5.1%

Sample

1st row 0.0
2nd row 922.1985434932673
3rd row 0.0
4th row 2546.249171408145
5th row 2546.249171408145
ValueCountFrequency (%)
0.0 1013
33.0%
922.1985434932673 188
 
6.1%
2546.249171408145 115
 
3.8%
3183.772359296243 101
 
3.3%
2983.0798593133177 95
 
3.1%
4504.128742457356 95
 
3.1%
4746.962209460676 53
 
1.7%
4281.9160661722035 48
 
1.6%
3983.9929662504123 41
 
1.3%
2239.9416356999986 34
 
1.1%
Other values (350) 1283
41.8%

Most occurring characters

ValueCountFrequency (%)
0 4412
11.4%
4 4164
10.8%
2 3985
10.3%
3 3884
10.0%
9 3710
9.6%
1 3309
8.6%
5 3249
8.4%
6 3086
8.0%
. 3066
7.9%
7 2990
7.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 35610
92.1%
Other Punctuation 3066
 
7.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4412
12.4%
4 4164
11.7%
2 3985
11.2%
3 3884
10.9%
9 3710
10.4%
1 3309
9.3%
5 3249
9.1%
6 3086
8.7%
7 2990
8.4%
8 2821
7.9%
Other Punctuation
ValueCountFrequency (%)
. 3066
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 38676
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4412
11.4%
4 4164
10.8%
2 3985
10.3%
3 3884
10.0%
9 3710
9.6%
1 3309
8.6%
5 3249
8.4%
6 3086
8.0%
. 3066
7.9%
7 2990
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38676
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4412
11.4%
4 4164
10.8%
2 3985
10.3%
3 3884
10.0%
9 3710
9.6%
1 3309
8.6%
5 3249
8.4%
6 3086
8.0%
. 3066
7.9%
7 2990
7.7%

issue
Text

Missing 

Distinct 62
Distinct (%) 0.1%
Missing 776215
Missing (%) 92.8%
Memory size 6.4 MiB

Length

Max length 107
Median length 22
Mean length 24.5116845
Min length 11

Characters and Unicode

Total characters 1470554
Distinct characters 26
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) < 0.1%

Sample

1st row TAXON_MATCH_HIGHERRANK
2nd row TAXON_MATCH_HIGHERRANK
3rd row TAXON_MATCH_HIGHERRANK
4th row TAXON_MATCH_HIGHERRANK
5th row PRESUMED_NEGATED_LATITUDE
ValueCountFrequency (%)
taxon_match_higherrank 30836
51.4%
taxon_match_fuzzy 11173
 
18.6%
continent_coordinate_mismatch 7741
 
12.9%
continent_country_mismatch 2069
 
3.4%
country_invalid 2031
 
3.4%
country_coordinate_mismatch 946
 
1.6%
country_derived_from_coordinates;country_invalid 906
 
1.5%
country_coordinate_mismatch;continent_derived_from_coordinates 858
 
1.4%
continent_coordinate_mismatch;continent_country_mismatch 764
 
1.3%
country_coordinate_mismatch;continent_coordinate_mismatch 540
 
0.9%
Other values (52) 2130
 
3.6%
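
Several of the values counted above are semicolon-separated lists of flags rather than single flags, so the table counts combinations. As a minimal sketch, assuming the column is available as a list of strings with None for missing cells, the individual flags could be tallied as follows. The function name count_issue_flags and the example rows are hypothetical and only reuse flag names that appear in this report.

```python
from collections import Counter


def count_issue_flags(values):
    """Tally individual flags from semicolon-separated issue strings."""
    counts = Counter()
    for value in values:
        if not value:
            # Skip missing cells (None or empty string).
            continue
        for flag in value.split(";"):
            flag = flag.strip()
            if flag:
                counts[flag] += 1
    return counts


# Hypothetical rows built from flag names seen in the report.
rows = [
    "TAXON_MATCH_HIGHERRANK",
    "COUNTRY_DERIVED_FROM_COORDINATES;COUNTRY_INVALID",
    None,
    "COUNTRY_INVALID",
]
flag_counts = count_issue_flags(rows)
print(flag_counts.most_common())
```

Counting per flag instead of per combination would show how often each flag occurs regardless of which other flags accompany it.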

Most occurring characters

ValueCountFrequency (%)
T 155712
10.6%
A 152174
10.3%
N 145950
9.9%
_ 128614
8.7%
H 121531
 
8.3%
O 99418
 
6.8%
C 97399
 
6.6%
R 93204
 
6.3%
I 85718
 
5.8%
M 76728
 
5.2%
Other values (16) 314106
21.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1337121
90.9%
Connector Punctuation 128614
 
8.7%
Other Punctuation 4819
 
0.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 155712
11.6%
A 152174
11.4%
N 145950
10.9%
H 121531
9.1%
O 99418
7.4%
C 97399
7.3%
R 93204
 
7.0%
I 85718
 
6.4%
M 76728
 
5.7%
E 67816
 
5.1%
Other values (14) 241471
18.1%
Connector Punctuation
ValueCountFrequency (%)
_ 128614
100.0%
Other Punctuation
ValueCountFrequency (%)
; 4819
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1337121
90.9%
Common 133433
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 155712
11.6%
A 152174
11.4%
N 145950
10.9%
H 121531
9.1%
O 99418
7.4%
C 97399
7.3%
R 93204
 
7.0%
I 85718
 
6.4%
M 76728
 
5.7%
E 67816
 
5.1%
Other values (14) 241471
18.1%
Common
ValueCountFrequency (%)
_ 128614
96.4%
; 4819
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1470554
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 155712
10.6%
A 152174
10.3%
N 145950
9.9%
_ 128614
8.7%
H 121531
 
8.3%
O 99418
 
6.8%
C 97399
 
6.6%
R 93204
 
6.3%
I 85718
 
5.8%
M 76728
 
5.2%
Other values (16) 314106
21.4%

mediaType
Text

Missing 

Distinct 7
Distinct (%) < 0.1%
Missing 57645
Missing (%) 6.9%
Memory size 6.4 MiB

Length

Max length 131
Median length 10
Mean length 10.01355316
Min length 10

Characters and Unicode

Total characters 7796192
Distinct characters 23
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row StillImage
2nd row StillImage
3rd row StillImage
4th row StillImage
5th row StillImage
ValueCountFrequency (%)
stillimage 777657
99.9%
stillimage;stillimage 883
 
0.1%
stillimage;stillimage;stillimage;stillimage 8
 
< 0.1%
stillimage;stillimage;stillimage 8
 
< 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage 6
 
< 0.1%
2024-11-01t10:28:05.946z 1
 
< 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
l 1559042
20.0%
S 779521
10.0%
t 779521
10.0%
i 779521
10.0%
I 779521
10.0%
m 779521
10.0%
a 779521
10.0%
g 779521
10.0%
e 779521
10.0%
; 958
 
< 0.1%
Other values (13) 24
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6236168
80.0%
Uppercase Letter 1559044
 
20.0%
Other Punctuation 961
 
< 0.1%
Decimal Number 17
 
< 0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 4
23.5%
0 4
23.5%
2 3
17.6%
4 2
11.8%
8 1
 
5.9%
5 1
 
5.9%
9 1
 
5.9%
6 1
 
5.9%
Lowercase Letter
ValueCountFrequency (%)
l 1559042
25.0%
t 779521
12.5%
i 779521
12.5%
m 779521
12.5%
a 779521
12.5%
g 779521
12.5%
e 779521
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 779521
50.0%
I 779521
50.0%
T 1
 
< 0.1%
Z 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
; 958
99.7%
: 2
 
0.2%
. 1
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7795212
> 99.9%
Common 980
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
; 958
97.8%
1 4
 
0.4%
0 4
 
0.4%
2 3
 
0.3%
4 2
 
0.2%
- 2
 
0.2%
: 2
 
0.2%
8 1
 
0.1%
5 1
 
0.1%
. 1
 
0.1%
Other values (2) 2
 
0.2%
Latin
ValueCountFrequency (%)
l 1559042
20.0%
S 779521
10.0%
t 779521
10.0%
i 779521
10.0%
I 779521
10.0%
m 779521
10.0%
a 779521
10.0%
g 779521
10.0%
e 779521
10.0%
T 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7796192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 1559042
20.0%
S 779521
10.0%
t 779521
10.0%
i 779521
10.0%
I 779521
10.0%
m 779521
10.0%
a 779521
10.0%
g 779521
10.0%
e 779521
10.0%
; 958
 
< 0.1%
Other values (13) 24
 
< 0.1%
Distinct 3
Distinct (%) < 0.1%
Missing 1
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 24
Median length 5
Mean length 4.577694784
Min length 4

Characters and Unicode

Total characters 3827905
Distinct characters 21
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row false
2nd row true
3rd row false
4th row false
5th row false
ValueCountFrequency (%)
false 483053
57.8%
true 353154
42.2%
2024-11-01t08:50:07.799z 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 836207
21.8%
f 483053
12.6%
l 483053
12.6%
s 483053
12.6%
a 483053
12.6%
t 353154
9.2%
r 353154
9.2%
u 353154
9.2%
0 5
 
< 0.1%
1 3
 
< 0.1%
Other values (11) 16
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3827881
> 99.9%
Decimal Number 17
 
< 0.1%
Other Punctuation 3
 
< 0.1%
Dash Punctuation 2
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 836207
21.8%
f 483053
12.6%
l 483053
12.6%
s 483053
12.6%
a 483053
12.6%
t 353154
9.2%
r 353154
9.2%
u 353154
9.2%
Decimal Number
ValueCountFrequency (%)
0 5
29.4%
1 3
17.6%
7 2
 
11.8%
2 2
 
11.8%
9 2
 
11.8%
4 1
 
5.9%
5 1
 
5.9%
8 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 2
66.7%
. 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
Z 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3827883
> 99.9%
Common 22
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 5
22.7%
1 3
13.6%
7 2
 
9.1%
2 2
 
9.1%
- 2
 
9.1%
9 2
 
9.1%
: 2
 
9.1%
. 1
 
4.5%
4 1
 
4.5%
5 1
 
4.5%
Latin
ValueCountFrequency (%)
e 836207
21.8%
f 483053
12.6%
l 483053
12.6%
s 483053
12.6%
a 483053
12.6%
t 353154
9.2%
r 353154
9.2%
u 353154
9.2%
T 1
 
< 0.1%
Z 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3827905
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 836207
21.8%
f 483053
12.6%
l 483053
12.6%
s 483053
12.6%
a 483053
12.6%
t 353154
9.2%
r 353154
9.2%
u 353154
9.2%
0 5
 
< 0.1%
1 3
 
< 0.1%
Other values (11) 16
 
< 0.1%
Distinct 2
Distinct (%) < 0.1%
Missing 1
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 5
Median length 5
Mean length 4.996380087
Min length 4

Characters and Unicode

Total characters 4178013
Distinct characters 8
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row false
2nd row false
3rd row false
4th row false
5th row false
ValueCountFrequency (%)
false 833181
99.6%
true 3027
 
0.4%

Most occurring characters

ValueCountFrequency (%)
e 836208
20.0%
f 833181
19.9%
a 833181
19.9%
l 833181
19.9%
s 833181
19.9%
t 3027
 
0.1%
r 3027
 
0.1%
u 3027
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4178013
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 836208
20.0%
f 833181
19.9%
a 833181
19.9%
l 833181
19.9%
s 833181
19.9%
t 3027
 
0.1%
r 3027
 
0.1%
u 3027
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4178013
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 836208
20.0%
f 833181
19.9%
a 833181
19.9%
l 833181
19.9%
s 833181
19.9%
t 3027
 
0.1%
r 3027
 
0.1%
u 3027
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4178013
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 836208
20.0%
f 833181
19.9%
a 833181
19.9%
l 833181
19.9%
s 833181
19.9%
t 3027
 
0.1%
r 3027
 
0.1%
u 3027
 
0.1%
Distinct 160036
Distinct (%) 19.1%
Missing 2
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 8
Median length 7
Mean length 6.959800624
Min length 1

Characters and Unicode

Total characters 5819834
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 76666
Unique (%) 9.2%

Sample

1st row 3189695
2nd row 4097456
3rd row 3189695
4th row 5284426
5th row 3189695
ValueCountFrequency (%)
6 3615
 
0.4%
11238428 1484
 
0.2%
3177662 1278
 
0.2%
2919963 909
 
0.1%
3189556 699
 
0.1%
3136365 627
 
0.1%
3033976 605
 
0.1%
3065 604
 
0.1%
3029010 590
 
0.1%
8798 574
 
0.1%
Other values (160026) 825222
98.7%

Most occurring characters

ValueCountFrequency (%)
3 746905
12.8%
2 693044
11.9%
7 628059
10.8%
5 626227
10.8%
8 550520
9.5%
0 524218
9.0%
9 522549
9.0%
1 521280
9.0%
6 515660
8.9%
4 491372
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5819834
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 746905
12.8%
2 693044
11.9%
7 628059
10.8%
5 626227
10.8%
8 550520
9.5%
0 524218
9.0%
9 522549
9.0%
1 521280
9.0%
6 515660
8.9%
4 491372
8.4%

Most occurring scripts

ValueCountFrequency (%)
Common 5819834
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 746905
12.8%
2 693044
11.9%
7 628059
10.8%
5 626227
10.8%
8 550520
9.5%
0 524218
9.0%
9 522549
9.0%
1 521280
9.0%
6 515660
8.9%
4 491372
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5819834
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 746905
12.8%
2 693044
11.9%
7 628059
10.8%
5 626227
10.8%
8 550520
9.5%
0 524218
9.0%
9 522549
9.0%
1 521280
9.0%
6 515660
8.9%
4 491372
8.4%
Distinct 126970
Distinct (%) 15.2%
Missing 48
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 8
Median length 7
Mean length 6.962477322
Min length 1

Characters and Unicode

Total characters 5821752
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 51536
Unique (%) 6.2%

Sample

1st row 3189695
2nd row 4097456
3rd row 3189695
4th row 5284426
5th row 3189695
ValueCountFrequency (%)
6 3615
 
0.4%
329 1484
 
0.2%
3177662 1278
 
0.2%
2919963 968
 
0.1%
9458333 756
 
0.1%
3189556 710
 
0.1%
3061139 634
 
0.1%
3029010 607
 
0.1%
3033976 605
 
0.1%
3065 604
 
0.1%
Other values (126960) 824900
98.7%

Most occurring characters

ValueCountFrequency (%)
3 742621
12.8%
2 701754
12.1%
7 632660
10.9%
5 617498
10.6%
8 540994
9.3%
1 531253
9.1%
0 530978
9.1%
9 526129
9.0%
6 511504
8.8%
4 486361
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5821752
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 742621
12.8%
2 701754
12.1%
7 632660
10.9%
5 617498
10.6%
8 540994
9.3%
1 531253
9.1%
0 530978
9.1%
9 526129
9.0%
6 511504
8.8%
4 486361
8.4%

Most occurring scripts

ValueCountFrequency (%)
Common 5821752
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 742621
12.8%
2 701754
12.1%
7 632660
10.9%
5 617498
10.6%
8 540994
9.3%
1 531253
9.1%
0 530978
9.1%
9 526129
9.0%
6 511504
8.8%
4 486361
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5821752
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 742621
12.8%
2 701754
12.1%
7 632660
10.9%
5 617498
10.6%
8 540994
9.3%
1 531253
9.1%
0 530978
9.1%
9 526129
9.0%
6 511504
8.8%
4 486361
8.4%
Distinct 8
Distinct (%) < 0.1%
Missing 1
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 5
Median length 1
Mean length 1.000004783
Min length 1

Characters and Unicode

Total characters 836212
Distinct characters 12
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row 6
2nd row 6
3rd row 6
4th row 6
5th row 6
ValueCountFrequency (%)
6 810590
96.9%
5 16418
 
2.0%
4 6571
 
0.8%
3 2508
 
0.3%
7 73
 
< 0.1%
0 46
 
< 0.1%
false 1
 
< 0.1%
1 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
6 810590
96.9%
5 16418
 
2.0%
4 6571
 
0.8%
3 2508
 
0.3%
7 73
 
< 0.1%
0 46
 
< 0.1%
f 1
 
< 0.1%
a 1
 
< 0.1%
l 1
 
< 0.1%
s 1
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 836207
> 99.9%
Lowercase Letter 5
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 810590
96.9%
5 16418
 
2.0%
4 6571
 
0.8%
3 2508
 
0.3%
7 73
 
< 0.1%
0 46
 
< 0.1%
1 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
f 1
20.0%
a 1
20.0%
l 1
20.0%
s 1
20.0%
e 1
20.0%

Most occurring scripts

ValueCountFrequency (%)
Common 836207
> 99.9%
Latin 5
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
6 810590
96.9%
5 16418
 
2.0%
4 6571
 
0.8%
3 2508
 
0.3%
7 73
 
< 0.1%
0 46
 
< 0.1%
1 1
 
< 0.1%
Latin
ValueCountFrequency (%)
f 1
20.0%
a 1
20.0%
l 1
20.0%
s 1
20.0%
e 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 836212
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 810590
96.9%
5 16418
 
2.0%
4 6571
 
0.8%
3 2508
 
0.3%
7 73
 
< 0.1%
0 46
 
< 0.1%
f 1
 
< 0.1%
a 1
 
< 0.1%
l 1
 
< 0.1%
s 1
 
< 0.1%
Other values (2) 2
 
< 0.1%
Distinct 25
Distinct (%) < 0.1%
Missing 4178
Missing (%) 0.5%
Memory size 6.4 MiB

Length

Max length 8
Median length 7
Mean length 6.669377703
Min length 1

Characters and Unicode

Total characters 5549129
Distinct characters 15
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row 7707728
2nd row 7707728
3rd row 7707728
4th row 7707728
5th row 7707728
ValueCountFrequency (%)
7707728 772865
92.9%
106 12168
 
1.5%
35 10128
 
1.2%
34 8611
 
1.0%
36 7692
 
0.9%
95 7630
 
0.9%
98 6511
 
0.8%
68 2493
 
0.3%
7819616 2034
 
0.2%
9 1729
 
0.2%
Other values (15) 170
 
< 0.1%
2025-01-08T18:39:48.390442image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Most occurring characters

ValueCountFrequency (%)
7 3093521
55.7%
0 785051
 
14.1%
8 783924
 
14.1%
2 772945
 
13.9%
3 26642
 
0.5%
6 26422
 
0.5%
9 17930
 
0.3%
5 17769
 
0.3%
1 16296
 
0.3%
4 8623
 
0.2%
Other values (5) 6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5549123
> 99.9%
Uppercase Letter 6
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 3093521
55.7%
0 785051
 
14.1%
8 783924
 
14.1%
2 772945
 
13.9%
3 26642
 
0.5%
6 26422
 
0.5%
9 17930
 
0.3%
5 17769
 
0.3%
1 16296
 
0.3%
4 8623
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
E 2
33.3%
U 1
16.7%
R 1
16.7%
O 1
16.7%
P 1
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 5549123
> 99.9%
Latin 6
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
7 3093521
55.7%
0 785051
 
14.1%
8 783924
 
14.1%
2 772945
 
13.9%
3 26642
 
0.5%
6 26422
 
0.5%
9 17930
 
0.3%
5 17769
 
0.3%
1 16296
 
0.3%
4 8623
 
0.2%
Latin
ValueCountFrequency (%)
E 2
33.3%
U 1
16.7%
R 1
16.7%
O 1
16.7%
P 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5549129
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 3093521
55.7%
0 785051
 
14.1%
8 783924
 
14.1%
2 772945
 
13.9%
3 26642
 
0.5%
6 26422
 
0.5%
9 17930
 
0.3%
5 17769
 
0.3%
1 16296
 
0.3%
4 8623
 
0.2%
Other values (5) 6
 
< 0.1%
Distinct 76
Distinct (%) < 0.1%
Missing 4394
Missing (%) 0.5%
Memory size 6.4 MiB

Length

Max length 8
Median length 3
Mean length 3.232160997
Min length 3

Characters and Unicode

Total characters 2688560
Distinct characters 15
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 9
Unique (%) < 0.1%

Sample

1st row 220
2nd row 220
3rd row 220
4th row 194
5th row 220

Value Count Frequency (%)
220 602481 72.4%
196 124141 14.9%
7228684 37585 4.5%
342 11412 1.4%
327 9385 1.1%
186 8120 1.0%
7073593 5400 0.6%
195 5312 0.6%
180 4420 0.5%
245 3965 0.5%
Other values (66) 19594 2.4%
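
The summary figures and value frequencies above can be cross-checked with a few lines of plain Python. The sketch below is an assumption about how such numbers are typically derived, not the report generator's implementation; `summarize` and the toy column are hypothetical. One detail it tries to mirror: the frequency percentages appear to be taken over non-missing rows (602481 of 836209 − 4394 = 831815 rows is about 72.4%), while "Missing (%)" is taken over all rows.

```python
from collections import Counter
from statistics import mean, median


def summarize(values):
    """Summarise a column of optional strings (None marks a missing cell).

    Assumes at least one non-missing value is present.
    """
    present = [v for v in values if v is not None]
    lengths = [len(v) for v in present]
    freq = Counter(present)
    missing = len(values) - len(present)
    return {
        "distinct": len(freq),
        "missing": missing,
        # Missing share is relative to all rows.
        "missing_pct": round(100 * missing / len(values), 1),
        # "Unique" counts values that occur exactly once.
        "unique": sum(1 for c in freq.values() if c == 1),
        "max_length": max(lengths),
        "median_length": median(lengths),
        "mean_length": mean(lengths),
        "min_length": min(lengths),
        # Value shares are relative to non-missing rows only.
        "frequency_pct": {
            v: round(100 * c / len(present), 1) for v, c in freq.most_common()
        },
    }


# Hypothetical toy column, loosely modelled on the sample rows above.
column = ["220", "220", "220", "194", "220", None, "7228684"]
stats = summarize(column)
print(stats)
```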

Most occurring characters

ValueCountFrequency (%)
2 1314439
48.9%
0 617260
23.0%
6 175553
 
6.5%
1 156948
 
5.8%
9 144675
 
5.4%
8 92312
 
3.4%
7 67617
 
2.5%
4 61556
 
2.3%
3 42529
 
1.6%
5 15665
 
0.6%
Other values (5) 6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2688554
> 99.9%
Uppercase Letter 6
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 1314439
48.9%
0 617260
23.0%
6 175553
 
6.5%
1 156948
 
5.8%
9 144675
 
5.4%
8 92312
 
3.4%
7 67617
 
2.5%
4 61556
 
2.3%
3 42529
 
1.6%
5 15665
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
E 2
33.3%
U 1
16.7%
R 1
16.7%
O 1
16.7%
P 1
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 2688554
> 99.9%
Latin 6
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 1314439
48.9%
0 617260
23.0%
6 175553
 
6.5%
1 156948
 
5.8%
9 144675
 
5.4%
8 92312
 
3.4%
7 67617
 
2.5%
4 61556
 
2.3%
3 42529
 
1.6%
5 15665
 
0.6%
Latin
ValueCountFrequency (%)
E 2
33.3%
U 1
16.7%
R 1
16.7%
O 1
16.7%
P 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2688560
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 1314439
48.9%
0 617260
23.0%
6 175553
 
6.5%
1 156948
 
5.8%
9 144675
 
5.4%
8 92312
 
3.4%
7 67617
 
2.5%
4 61556
 
2.3%
3 42529
 
1.6%
5 15665
 
0.6%
Other values (5) 6
 
< 0.1%
Distinct 379
Distinct (%) < 0.1%
Missing 6630
Missing (%) 0.8%
Memory size 6.4 MiB

Length

Max length 8
Median length 3
Mean length 3.671659962
Min length 3

Characters and Unicode

Total characters 3045932
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 39
Unique (%) < 0.1%

Sample

1st row 408
2nd row 941
3rd row 408
4th row 640
5th row 408

Value Count Frequency (%)
1369 73718 8.9%
414 57572 6.9%
1414 56399 6.8%
1370 55446 6.7%
408 55104 6.6%
412 52371 6.3%
691 40401 4.9%
1353 30937 3.7%
422 30623 3.7%
392 28749 3.5%
Other values (369) 348259 42.0%

Most occurring characters

ValueCountFrequency (%)
1 658148
21.6%
4 499634
16.4%
3 360783
11.8%
9 328336
10.8%
2 299776
9.8%
6 252940
 
8.3%
0 216746
 
7.1%
7 171731
 
5.6%
5 160152
 
5.3%
8 97686
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3045932
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 658148
21.6%
4 499634
16.4%
3 360783
11.8%
9 328336
10.8%
2 299776
9.8%
6 252940
 
8.3%
0 216746
 
7.1%
7 171731
 
5.6%
5 160152
 
5.3%
8 97686
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Common 3045932
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 658148
21.6%
4 499634
16.4%
3 360783
11.8%
9 328336
10.8%
2 299776
9.8%
6 252940
 
8.3%
0 216746
 
7.1%
7 171731
 
5.6%
5 160152
 
5.3%
8 97686
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3045932
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 658148
21.6%
4 499634
16.4%
3 360783
11.8%
9 328336
10.8%
2 299776
9.8%
6 252940
 
8.3%
0 216746
 
7.1%
7 171731
 
5.6%
5 160152
 
5.3%
8 97686
 
3.2%
Distinct 1416
Distinct (%) 0.2%
Missing 6696
Missing (%) 0.8%
Memory size 6.4 MiB

Length

Max length 8
Median length 4
Mean length 4.147109207
Min length 4

Characters and Unicode

Total characters 3440081
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 175
Unique (%) < 0.1%

Sample

1st row 2420
2nd row 6645
3rd row 2420
4th row 3924
5th row 2420

Value Count Frequency (%)
5386 52179 6.3%
3065 51696 6.2%
3073 43659 5.3%
8798 32694 3.9%
7708 22275 2.7%
2497 20240 2.4%
5015 19433 2.3%
7689 15993 1.9%
4691 15181 1.8%
6685 13704 1.7%
Other values (1406) 542459 65.4%

Most occurring characters

ValueCountFrequency (%)
6 584173
17.0%
3 419239
12.2%
8 373106
10.8%
7 361654
10.5%
2 327882
9.5%
0 316208
9.2%
5 303481
8.8%
4 271279
7.9%
9 248947
7.2%
1 234112
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3440081
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 584173
17.0%
3 419239
12.2%
8 373106
10.8%
7 361654
10.5%
2 327882
9.5%
0 316208
9.2%
5 303481
8.8%
4 271279
7.9%
9 248947
7.2%
1 234112
6.8%

Most occurring scripts

ValueCountFrequency (%)
Common 3440081
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 584173
17.0%
3 419239
12.2%
8 373106
10.8%
7 361654
10.5%
2 327882
9.5%
0 316208
9.2%
5 303481
8.8%
4 271279
7.9%
9 248947
7.2%
1 234112
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3440081
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 584173
17.0%
3 419239
12.2%
8 373106
10.8%
7 361654
10.5%
2 327882
9.5%
0 316208
9.2%
5 303481
8.8%
4 271279
7.9%
9 248947
7.2%
1 234112
6.8%

genusKey
Text

Missing 

Distinct 14164
Distinct (%) 1.7%
Missing 13165
Missing (%) 1.6%
Memory size 6.4 MiB

Length

Max length 8
Median length 7
Mean length 7.020595497
Min length 7

Characters and Unicode

Total characters 5778259
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2534
Unique (%) 0.3%

Sample

1st row 3189695
2nd row 10803341
3rd row 3189695
4th row 2685008
5th row 3189695

Value Count Frequency (%)
2721893 9711 1.2%
2984588 7339 0.9%
2988638 6530 0.8%
7787708 5116 0.6%
2713455 4059 0.5%
3039576 3696 0.4%
3033294 3488 0.4%
2913027 3355 0.4%
11397237 3348 0.4%
2650583 3302 0.4%
Other values (14154) 773100 93.9%

Most occurring characters

ValueCountFrequency (%)
2 817492
14.1%
3 775895
13.4%
7 612005
10.6%
9 570180
9.9%
8 565886
9.8%
1 546714
9.5%
0 542830
9.4%
6 485847
8.4%
5 463136
8.0%
4 398274
6.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5778259
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 817492
14.1%
3 775895
13.4%
7 612005
10.6%
9 570180
9.9%
8 565886
9.8%
1 546714
9.5%
0 542830
9.4%
6 485847
8.4%
5 463136
8.0%
4 398274
6.9%

Most occurring scripts

ValueCountFrequency (%)
Common 5778259
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 817492
14.1%
3 775895
13.4%
7 612005
10.6%
9 570180
9.9%
8 565886
9.8%
1 546714
9.5%
0 542830
9.4%
6 485847
8.4%
5 463136
8.0%
4 398274
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5778259
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 817492
14.1%
3 775895
13.4%
7 612005
10.6%
9 570180
9.9%
8 565886
9.8%
1 546714
9.5%
0 542830
9.4%
6 485847
8.4%
5 463136
8.0%
4 398274
6.9%

speciesKey
Text

Missing 

Distinct 111719
Distinct (%) 14.7%
Missing 78171
Missing (%) 9.3%
Memory size 6.4 MiB

Length

Max length 8
Median length 7
Mean length 7.020400033
Min length 7

Characters and Unicode

Total characters 5321730
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 44299
Unique (%) 5.8%

Sample

1st row 4097456
2nd row 5284426
3rd row 4096206
4th row 2886544
5th row 3189723

Value Count Frequency (%)
9458333 757 0.1%
7558421 511 0.1%
9364157 471 0.1%
2810155 420 0.1%
2704922 413 0.1%
8179794 383 0.1%
2882482 380 0.1%
2975014 376 < 0.1%
2913130 373 < 0.1%
5350452 354 < 0.1%
Other values (111709) 753600 99.4%

Most occurring characters

ValueCountFrequency (%)
3 682173
12.8%
2 628214
11.8%
5 581954
10.9%
7 569005
10.7%
8 499741
9.4%
1 487025
9.2%
0 485699
9.1%
9 476907
9.0%
4 456796
8.6%
6 454216
8.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5321730
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 682173
12.8%
2 628214
11.8%
5 581954
10.9%
7 569005
10.7%
8 499741
9.4%
1 487025
9.2%
0 485699
9.1%
9 476907
9.0%
4 456796
8.6%
6 454216
8.5%

Most occurring scripts

ValueCountFrequency (%)
Common 5321730
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 682173
12.8%
2 628214
11.8%
5 581954
10.9%
7 569005
10.7%
8 499741
9.4%
1 487025
9.2%
0 485699
9.1%
9 476907
9.0%
4 456796
8.6%
6 454216
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5321730
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 682173
12.8%
2 628214
11.8%
5 581954
10.9%
7 569005
10.7%
8 499741
9.4%
1 487025
9.2%
0 485699
9.1%
9 476907
9.0%
4 456796
8.6%
6 454216
8.5%

species
Text

Missing 

Distinct 111366
Distinct (%) 14.7%
Missing 78171
Missing (%) 9.3%
Memory size 6.4 MiB

Length

Max length 38
Median length 32
Mean length 18.62692635
Min length 8

Characters and Unicode

Total characters 14119918
Distinct characters 55
Distinct categories 5
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 44055
Unique (%) 5.8%

Sample

1st row Shorea platycarpa
2nd row Agathis borneensis
3rd row Shorea hopeifolia
4th row Palaquium hexandrum
5th row Plantago ovata

Value Count Frequency (%)
carex 9451 0.6%
ficus 7087 0.5%
rubus 6054 0.4%
taraxacum 4923 0.3%
vulgaris 4105 0.3%
cyperus 3670 0.2%
ranunculus 3401 0.2%
salix 3385 0.2%
galium 3222 0.2%
euphorbia 3160 0.2%
Other values (49526) 1467762 96.8%
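
The frequency rows for this variable list lowercase single words such as "carex" and "vulgaris" rather than whole binomials, which suggests the counts are taken per word after lowercasing. That reading is an inference from the table, not something the report states. Under that assumption, a minimal sketch of the counting step could look like this; `word_frequencies` is a hypothetical helper and the input is the five sample rows shown for this variable.

```python
from collections import Counter


def word_frequencies(values):
    """Count lowercase, whitespace-separated words across non-missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(value.lower().split())
    return counts


# The five sample rows reported for this variable.
rows = [
    "Shorea platycarpa",
    "Agathis borneensis",
    "Shorea hopeifolia",
    "Palaquium hexandrum",
    "Plantago ovata",
]
freq = word_frequencies(rows)
print(freq.most_common(3))
```

Counting per word explains why a genus name can outrank any single species: every species in the genus contributes to the same token.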

Most occurring characters

ValueCountFrequency (%)
a 1748365
 
12.4%
i 1353758
 
9.6%
e 927246
 
6.6%
r 886879
 
6.3%
s 873334
 
6.2%
o 829205
 
5.9%
l 796707
 
5.6%
u 784536
 
5.6%
758182
 
5.4%
n 758181
 
5.4%
Other values (45) 4403525
31.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12598753
89.2%
Space Separator 758182
 
5.4%
Uppercase Letter 758103
 
5.4%
Dash Punctuation 4861
 
< 0.1%
Math Symbol 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1748365
13.9%
i 1353758
10.7%
e 927246
 
7.4%
r 886879
 
7.0%
s 873334
 
6.9%
o 829205
 
6.6%
l 796707
 
6.3%
u 784536
 
6.2%
n 758181
 
6.0%
t 627646
 
5.0%
Other values (16) 3012896
23.9%
Uppercase Letter
ValueCountFrequency (%)
C 103470
13.6%
P 79067
 
10.4%
S 73083
 
9.6%
A 72797
 
9.6%
M 44569
 
5.9%
L 40379
 
5.3%
D 40075
 
5.3%
T 37129
 
4.9%
E 35245
 
4.6%
G 33655
 
4.4%
Other values (16) 198634
26.2%
Space Separator
ValueCountFrequency (%)
758182
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4861
100.0%
Math Symbol
ValueCountFrequency (%)
× 19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13356856
94.6%
Common 763062
 
5.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1748365
13.1%
i 1353758
 
10.1%
e 927246
 
6.9%
r 886879
 
6.6%
s 873334
 
6.5%
o 829205
 
6.2%
l 796707
 
6.0%
u 784536
 
5.9%
n 758181
 
5.7%
t 627646
 
4.7%
Other values (42) 3770999
28.2%
Common
ValueCountFrequency (%)
758182
99.4%
- 4861
 
0.6%
× 19
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14119899
> 99.9%
None 19
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1748365
 
12.4%
i 1353758
 
9.6%
e 927246
 
6.6%
r 886879
 
6.3%
s 873334
 
6.2%
o 829205
 
5.9%
l 796707
 
5.6%
u 784536
 
5.6%
758182
 
5.4%
n 758181
 
5.4%
Other values (44) 4403506
31.2%
None
ValueCountFrequency (%)
× 19
100.0%
Distinct 126970
Distinct (%) 15.2%
Missing 48
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 152
Median length 95
Mean length 29.51563515
Min length 5

Characters and Unicode

Total characters 24679823
Distinct characters 122
Distinct categories 10
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 51536
Unique (%) 6.2%

Sample

1st row Plantago L.
2nd row Shorea platycarpa F.Heim
3rd row Plantago L.
4th row Agathis borneensis Warb.
5th row Plantago L.
ValueCountFrequency (%)
l 230364
 
7.4%
85958
 
2.8%
ex 51904
 
1.7%
subsp 38063
 
1.2%
blume 32827
 
1.1%
var 17559
 
0.6%
dc 17352
 
0.6%
benth 14051
 
0.5%
miq 11890
 
0.4%
willd 10347
 
0.3%
Other values (63531) 2597773
83.6%

Most occurring characters

ValueCountFrequency (%)
2271927
 
9.2%
a 2240310
 
9.1%
i 1724481
 
7.0%
e 1527392
 
6.2%
r 1349022
 
5.5%
l 1227513
 
5.0%
s 1201503
 
4.9%
o 1166198
 
4.7%
. 1157405
 
4.7%
n 1097269
 
4.4%
Other values (112) 9716803
39.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18079514
73.3%
Uppercase Letter 2406058
 
9.7%
Space Separator 2271927
 
9.2%
Other Punctuation 1266100
 
5.1%
Close Punctuation 298092
 
1.2%
Open Punctuation 298092
 
1.2%
Decimal Number 42748
 
0.2%
Dash Punctuation 13201
 
0.1%
Math Symbol 4070
 
< 0.1%
Connector Punctuation 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2240310
12.4%
i 1724481
 
9.5%
e 1527392
 
8.4%
r 1349022
 
7.5%
l 1227513
 
6.8%
s 1201503
 
6.6%
o 1166198
 
6.5%
n 1097269
 
6.1%
u 1091448
 
6.0%
t 891929
 
4.9%
Other values (54) 4562449
25.2%
Uppercase Letter
ValueCountFrequency (%)
L 340209
14.1%
C 201995
 
8.4%
S 200872
 
8.3%
B 180383
 
7.5%
P 157094
 
6.5%
M 155909
 
6.5%
A 146316
 
6.1%
H 129394
 
5.4%
D 119052
 
4.9%
R 117798
 
4.9%
Other values (28) 657036
27.3%
Decimal Number
ValueCountFrequency (%)
1 12439
29.1%
8 8588
20.1%
9 4589
 
10.7%
2 2962
 
6.9%
7 2867
 
6.7%
3 2823
 
6.6%
0 2622
 
6.1%
4 2565
 
6.0%
5 1962
 
4.6%
6 1331
 
3.1%
Other Punctuation
ValueCountFrequency (%)
. 1157405
91.4%
& 85958
 
6.8%
, 21063
 
1.7%
' 1674
 
0.1%
Space Separator
ValueCountFrequency (%)
2271927
100.0%
Close Punctuation
ValueCountFrequency (%)
) 298092
100.0%
Open Punctuation
ValueCountFrequency (%)
( 298092
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13201
100.0%
Math Symbol
ValueCountFrequency (%)
× 4070
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20485572
83.0%
Common 4194251
 
17.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2240310
 
10.9%
i 1724481
 
8.4%
e 1527392
 
7.5%
r 1349022
 
6.6%
l 1227513
 
6.0%
s 1201503
 
5.9%
o 1166198
 
5.7%
n 1097269
 
5.4%
u 1091448
 
5.3%
t 891929
 
4.4%
Other values (92) 6968507
34.0%
Common
ValueCountFrequency (%)
2271927
54.2%
. 1157405
27.6%
) 298092
 
7.1%
( 298092
 
7.1%
& 85958
 
2.0%
, 21063
 
0.5%
- 13201
 
0.3%
1 12439
 
0.3%
8 8588
 
0.2%
9 4589
 
0.1%
Other values (10) 22897
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24641457
99.8%
None 38366
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2271927
 
9.2%
a 2240310
 
9.1%
i 1724481
 
7.0%
e 1527392
 
6.2%
r 1349022
 
5.5%
l 1227513
 
5.0%
s 1201503
 
4.9%
o 1166198
 
4.7%
. 1157405
 
4.7%
n 1097269
 
4.5%
Other values (61) 9678437
39.3%
None
ValueCountFrequency (%)
ü 13590
35.4%
é 7617
19.9%
× 4070
 
10.6%
ö 3266
 
8.5%
á 2018
 
5.3%
ä 1595
 
4.2%
ó 1102
 
2.9%
è 637
 
1.7%
Á 609
 
1.6%
ø 593
 
1.5%
Other values (41) 3269
 
8.5%
Distinct 175339
Distinct (%) 21.0%
Missing 45
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 101
Median length 84
Mean length 28.43354175
Min length 3

Characters and Unicode

Total characters 23775104
Distinct characters 118
Distinct categories 9
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 90006
Unique (%) 10.8%

Sample

1st row Plantago psyllium L.
2nd row Shorea platycarpa Heim
3rd row Plantago psyllium L.
4th row Agathis borneensis Warb.
5th row Plantago psyllium L.
ValueCountFrequency (%)
l 208156
 
7.0%
59763
 
2.0%
ex 42677
 
1.4%
var 39274
 
1.3%
blume 30277
 
1.0%
subsp 26912
 
0.9%
dc 18463
 
0.6%
benth 14237
 
0.5%
indet 12644
 
0.4%
miq 12572
 
0.4%
Other values (72954) 2530009
84.5%

Most occurring characters

ValueCountFrequency (%)
a 2207594
 
9.3%
2158927
 
9.1%
i 1707901
 
7.2%
e 1477517
 
6.2%
r 1333444
 
5.6%
l 1185137
 
5.0%
s 1163476
 
4.9%
o 1123360
 
4.7%
u 1075965
 
4.5%
. 1074132
 
4.5%
Other values (108) 9267651
39.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17714410
74.5%
Uppercase Letter 2225741
 
9.4%
Space Separator 2158927
 
9.1%
Other Punctuation 1149367
 
4.8%
Open Punctuation 256229
 
1.1%
Close Punctuation 256228
 
1.1%
Dash Punctuation 11272
 
< 0.1%
Math Symbol 1535
 
< 0.1%
Decimal Number 1395
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2207594
12.5%
i 1707901
 
9.6%
e 1477517
 
8.3%
r 1333444
 
7.5%
l 1185137
 
6.7%
s 1163476
 
6.6%
o 1123360
 
6.3%
u 1075965
 
6.1%
n 1072354
 
6.1%
t 884232
 
5.0%
Other values (44) 4483430
25.3%
Uppercase Letter
ValueCountFrequency (%)
L 306900
13.8%
C 198352
 
8.9%
S 188310
 
8.5%
B 169967
 
7.6%
M 141899
 
6.4%
P 141155
 
6.3%
A 140243
 
6.3%
H 122694
 
5.5%
D 113713
 
5.1%
R 103647
 
4.7%
Other values (24) 598861
26.9%
Other Punctuation
ValueCountFrequency (%)
. 1074132
93.5%
& 59726
 
5.2%
' 12741
 
1.1%
, 2682
 
0.2%
" 49
 
< 0.1%
? 25
 
< 0.1%
! 8
 
< 0.1%
/ 1
 
< 0.1%
1
 
< 0.1%
; 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 947
67.9%
2 274
 
19.6%
4 32
 
2.3%
3 29
 
2.1%
0 29
 
2.1%
6 23
 
1.6%
7 21
 
1.5%
8 15
 
1.1%
5 14
 
1.0%
9 11
 
0.8%
Math Symbol
ValueCountFrequency (%)
× 1532
99.8%
+ 2
 
0.1%
= 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 255077
99.6%
[ 1152
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 255076
99.6%
] 1152
 
0.4%
Space Separator
ValueCountFrequency (%)
2158927
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 11272
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19940151
83.9%
Common 3834953
 
16.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2207594
 
11.1%
i 1707901
 
8.6%
e 1477517
 
7.4%
r 1333444
 
6.7%
l 1185137
 
5.9%
s 1163476
 
5.8%
o 1123360
 
5.6%
u 1075965
 
5.4%
n 1072354
 
5.4%
t 884232
 
4.4%
Other values (78) 6709171
33.6%
Common
ValueCountFrequency (%)
2158927
56.3%
. 1074132
28.0%
( 255077
 
6.7%
) 255076
 
6.7%
& 59726
 
1.6%
' 12741
 
0.3%
- 11272
 
0.3%
, 2682
 
0.1%
× 1532
 
< 0.1%
] 1152
 
< 0.1%
Other values (20) 2636
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23744990
99.9%
None 30113
 
0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2207594
 
9.3%
2158927
 
9.1%
i 1707901
 
7.2%
e 1477517
 
6.2%
r 1333444
 
5.6%
l 1185137
 
5.0%
s 1163476
 
4.9%
o 1123360
 
4.7%
u 1075965
 
4.5%
. 1074132
 
4.5%
Other values (70) 9237537
38.9%
None
ValueCountFrequency (%)
ü 13903
46.2%
é 7175
23.8%
ö 2297
 
7.6%
× 1532
 
5.1%
ä 1135
 
3.8%
ó 807
 
2.7%
á 802
 
2.7%
è 641
 
2.1%
ø 519
 
1.7%
ê 146
 
0.5%
Other values (27) 1156
 
3.8%
Punctuation
ValueCountFrequency (%)
1
100.0%

typifiedName
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 836208
Missing (%) > 99.9%
Memory size 6.4 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 2
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row NE

Value Count Frequency (%)
ne 1 100.0%

Most occurring characters

ValueCountFrequency (%)
N 1
50.0%
E 1
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 1
50.0%
E 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1
50.0%
E 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1
50.0%
E 1
50.0%

protocol
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 2
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 11
Median length 11
Mean length 11
Min length 11

Characters and Unicode

Total characters 9198277
Distinct characters 10
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row DWC_ARCHIVE
2nd row DWC_ARCHIVE
3rd row DWC_ARCHIVE
4th row DWC_ARCHIVE
5th row DWC_ARCHIVE

Value Count Frequency (%)
dwc_archive 836207 100.0%

Most occurring characters

ValueCountFrequency (%)
C 1672414
18.2%
D 836207
9.1%
W 836207
9.1%
_ 836207
9.1%
A 836207
9.1%
R 836207
9.1%
H 836207
9.1%
I 836207
9.1%
V 836207
9.1%
E 836207
9.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 8362070
90.9%
Connector Punctuation 836207
 
9.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 1672414
20.0%
D 836207
10.0%
W 836207
10.0%
A 836207
10.0%
R 836207
10.0%
H 836207
10.0%
I 836207
10.0%
V 836207
10.0%
E 836207
10.0%
Connector Punctuation
ValueCountFrequency (%)
_ 836207
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8362070
90.9%
Common 836207
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 1672414
20.0%
D 836207
10.0%
W 836207
10.0%
A 836207
10.0%
R 836207
10.0%
H 836207
10.0%
I 836207
10.0%
V 836207
10.0%
E 836207
10.0%
Common
ValueCountFrequency (%)
_ 836207
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9198277
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 1672414
18.2%
D 836207
9.1%
W 836207
9.1%
_ 836207
9.1%
A 836207
9.1%
R 836207
9.1%
H 836207
9.1%
I 836207
9.1%
V 836207
9.1%
E 836207
9.1%
Distinct 151288
Distinct (%) 18.1%
Missing 2
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 24
Median length 24
Mean length 23.99604404
Min length 20

Characters and Unicode

Total characters 20065660
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 19524
Unique (%) 2.3%

Sample

1st row 2024-11-01T10:27:16.300Z
2nd row 2024-11-01T10:29:04.857Z
3rd row 2024-11-01T10:27:16.301Z
4th row 2024-11-01T10:29:41.603Z
5th row 2024-11-01T10:27:17.382Z

Value Count Frequency (%)
2024-11-01t10:27:17.419z 32 < 0.1%
2024-11-01t10:26:47.509z 30 < 0.1%
2024-11-01t10:27:17.556z 29 < 0.1%
2024-11-01t10:27:17.502z 28 < 0.1%
2024-11-01t10:28:04.529z 28 < 0.1%
2024-11-01t10:27:28.429z 28 < 0.1%
2024-11-01t10:27:27.691z 28 < 0.1%
2024-11-01t10:27:17.495z 28 < 0.1%
2024-11-01t10:27:16.167z 28 < 0.1%
2024-11-01t10:27:02.841z 28 < 0.1%
Other values (151278) 835920 > 99.9%
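
The length range for this variable (20 to 24 characters), together with "." being counted 835380 times against 836207 non-missing values, suggests that a small number of timestamps omit the fractional seconds. That is an inference from the counts, not a statement in the report. A parser for this column would therefore probably need to accept both forms. The sketch below uses only the standard library; `parse_utc`, `FORMATS`, and the 20-character example are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Two layouts: with and without fractional seconds, both ending in a literal "Z".
FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_utc(text):
    """Parse a 'Z'-suffixed ISO 8601 timestamp into an aware UTC datetime."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {text!r}")


with_ms = parse_utc("2024-11-01T10:27:16.300Z")  # value taken from the sample rows
without_ms = parse_utc("2024-11-01T10:27:16Z")   # hypothetical 20-character form
print(with_ms.isoformat(), without_ms.isoformat())
```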

Most occurring characters

ValueCountFrequency (%)
1 3864670
19.3%
0 3022309
15.1%
2 2963708
14.8%
- 1672414
8.3%
: 1672414
8.3%
4 1297614
 
6.5%
T 836207
 
4.2%
Z 836207
 
4.2%
. 835380
 
4.2%
7 673118
 
3.4%
Other values (5) 2391619
11.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14213038
70.8%
Other Punctuation 2507794
 
12.5%
Dash Punctuation 1672414
 
8.3%
Uppercase Letter 1672414
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3864670
27.2%
0 3022309
21.3%
2 2963708
20.9%
4 1297614
 
9.1%
7 673118
 
4.7%
8 599914
 
4.2%
9 470980
 
3.3%
5 457024
 
3.2%
3 449982
 
3.2%
6 413719
 
2.9%
Other Punctuation
ValueCountFrequency (%)
: 1672414
66.7%
. 835380
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 836207
50.0%
Z 836207
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1672414
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 18393246
91.7%
Latin 1672414
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3864670
21.0%
0 3022309
16.4%
2 2963708
16.1%
- 1672414
9.1%
: 1672414
9.1%
4 1297614
 
7.1%
. 835380
 
4.5%
7 673118
 
3.7%
8 599914
 
3.3%
9 470980
 
2.6%
Other values (3) 1320725
 
7.2%
Latin
ValueCountFrequency (%)
T 836207
50.0%
Z 836207
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20065660
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3864670
19.3%
0 3022309
15.1%
2 2963708
14.8%
- 1672414
8.3%
: 1672414
8.3%
4 1297614
 
6.5%
T 836207
 
4.2%
Z 836207
 
4.2%
. 835380
 
4.2%
7 673118
 
3.4%
Other values (5) 2391619
11.9%

lastCrawled
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 2
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 24
Median length 24
Mean length 24
Min length 24

Characters and Unicode

Total characters 20068968
Distinct characters 13
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 2024-11-01T08:50:07.799Z
2nd row 2024-11-01T08:50:07.799Z
3rd row 2024-11-01T08:50:07.799Z
4th row 2024-11-01T08:50:07.799Z
5th row 2024-11-01T08:50:07.799Z

Value Count Frequency (%)
2024-11-01t08:50:07.799z 836207 100.0%

Most occurring characters

ValueCountFrequency (%)
0 4181035
20.8%
1 2508621
12.5%
2 1672414
 
8.3%
- 1672414
 
8.3%
: 1672414
 
8.3%
7 1672414
 
8.3%
9 1672414
 
8.3%
4 836207
 
4.2%
T 836207
 
4.2%
8 836207
 
4.2%
Other values (3) 2508621
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14215519
70.8%
Other Punctuation 2508621
 
12.5%
Dash Punctuation 1672414
 
8.3%
Uppercase Letter 1672414
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4181035
29.4%
1 2508621
17.6%
2 1672414
 
11.8%
7 1672414
 
11.8%
9 1672414
 
11.8%
4 836207
 
5.9%
8 836207
 
5.9%
5 836207
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 1672414
66.7%
. 836207
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 836207
50.0%
Z 836207
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1672414
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 18396554
91.7%
Latin 1672414
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4181035
22.7%
1 2508621
13.6%
2 1672414
 
9.1%
- 1672414
 
9.1%
: 1672414
 
9.1%
7 1672414
 
9.1%
9 1672414
 
9.1%
4 836207
 
4.5%
8 836207
 
4.5%
5 836207
 
4.5%
Latin
ValueCountFrequency (%)
T 836207
50.0%
Z 836207
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20068968
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4181035
20.8%
1 2508621
12.5%
2 1672414
 
8.3%
- 1672414
 
8.3%
: 1672414
 
8.3%
7 1672414
 
8.3%
9 1672414
 
8.3%
4 836207
 
4.2%
T 836207
 
4.2%
8 836207
 
4.2%
Other values (3) 2508621
12.5%
Distinct 2
Distinct (%) < 0.1%
Missing 2188
Missing (%) 0.3%
Memory size 6.4 MiB

Length

Max length 5
Median length 4
Mean length 4.142760194
Min length 4

Characters and Unicode

Total characters 3455149
Distinct characters 8
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row true
2nd row true
3rd row true
4th row true
5th row true

Value Count Frequency (%)
true 714956 85.7%
false 119065 14.3%

Most occurring characters

ValueCountFrequency (%)
e 834021
24.1%
t 714956
20.7%
r 714956
20.7%
u 714956
20.7%
f 119065
 
3.4%
a 119065
 
3.4%
l 119065
 
3.4%
s 119065
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3455149
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 834021
24.1%
t 714956
20.7%
r 714956
20.7%
u 714956
20.7%
f 119065
 
3.4%
a 119065
 
3.4%
l 119065
 
3.4%
s 119065
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 3455149
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 834021
24.1%
t 714956
20.7%
r 714956
20.7%
u 714956
20.7%
f 119065
 
3.4%
a 119065
 
3.4%
l 119065
 
3.4%
s 119065
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3455149
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 834021
24.1%
t 714956
20.7%
r 714956
20.7%
u 714956
20.7%
f 119065
 
3.4%
a 119065
 
3.4%
l 119065
 
3.4%
s 119065
 
3.4%

isSequenced
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 2
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 4181035
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row false
2nd row false
3rd row false
4th row false
5th row false
Value Count Frequency (%)
false 836207 100.0%

Most occurring characters

ValueCountFrequency (%)
f 836207
20.0%
a 836207
20.0%
l 836207
20.0%
s 836207
20.0%
e 836207
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4181035
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 836207
20.0%
a 836207
20.0%
l 836207
20.0%
s 836207
20.0%
e 836207
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4181035
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 836207
20.0%
a 836207
20.0%
l 836207
20.0%
s 836207
20.0%
e 836207
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4181035
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 836207
20.0%
a 836207
20.0%
l 836207
20.0%
s 836207
20.0%
e 836207
20.0%

gbifRegion
Text

Missing 

Distinct 7
Distinct (%) < 0.1%
Missing 151640
Missing (%) 18.1%
Memory size 6.4 MiB

Length

Max length 13
Median length 10
Mean length 6.565221329
Min length 4

Characters and Unicode

Total characters 4494347
Distinct characters 16
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row EUROPE
2nd row ASIA
3rd row EUROPE
4th row ASIA
5th row EUROPE
Value Count Frequency (%)
asia 207299 30.3%
europe 201481 29.4%
africa 110915 16.2%
latin_america 84804 12.4%
oceania 58632 8.6%
north_america 21173 3.1%
antarctica 265 < 0.1%

Most occurring characters

ValueCountFrequency (%)
A 1051245
23.4%
I 567892
12.6%
E 567571
12.6%
R 439811
9.8%
O 281286
 
6.3%
C 276054
 
6.1%
S 207299
 
4.6%
U 201481
 
4.5%
P 201481
 
4.5%
N 164874
 
3.7%
Other values (6) 535353
11.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4388370
97.6%
Connector Punctuation 105977
 
2.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1051245
24.0%
I 567892
12.9%
E 567571
12.9%
R 439811
10.0%
O 281286
 
6.4%
C 276054
 
6.3%
S 207299
 
4.7%
U 201481
 
4.6%
P 201481
 
4.6%
N 164874
 
3.8%
Other values (5) 429376
9.8%
Connector Punctuation
ValueCountFrequency (%)
_ 105977
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4388370
97.6%
Common 105977
 
2.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1051245
24.0%
I 567892
12.9%
E 567571
12.9%
R 439811
10.0%
O 281286
 
6.4%
C 276054
 
6.3%
S 207299
 
4.7%
U 201481
 
4.6%
P 201481
 
4.6%
N 164874
 
3.8%
Other values (5) 429376
9.8%
Common
ValueCountFrequency (%)
_ 105977
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4494347
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1051245
23.4%
I 567892
12.6%
E 567571
12.6%
R 439811
9.8%
O 281286
 
6.3%
C 276054
 
6.1%
S 207299
 
4.6%
U 201481
 
4.5%
P 201481
 
4.5%
N 164874
 
3.7%
Other values (6) 535353
11.9%

publishedByGbifRegion
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 2
Missing (%) < 0.1%
Memory size 6.4 MiB

Length

Max length 6
Median length 6
Mean length 6
Min length 6

Characters and Unicode

Total characters 5017242
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row EUROPE
2nd row EUROPE
3rd row EUROPE
4th row EUROPE
5th row EUROPE
Value Count Frequency (%)
europe 836207 100.0%

Most occurring characters

ValueCountFrequency (%)
E 1672414
33.3%
U 836207
16.7%
R 836207
16.7%
O 836207
16.7%
P 836207
16.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5017242
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1672414
33.3%
U 836207
16.7%
R 836207
16.7%
O 836207
16.7%
P 836207
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 5017242
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1672414
33.3%
U 836207
16.7%
R 836207
16.7%
O 836207
16.7%
P 836207
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5017242
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1672414
33.3%
U 836207
16.7%
R 836207
16.7%
O 836207
16.7%
P 836207
16.7%

level0Gid
Text

Missing 

Distinct 213
Distinct (%) 0.1%
Missing 497950
Missing (%) 59.5%
Memory size 6.4 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 1014777
Distinct characters 30
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 10
Unique (%) < 0.1%

Sample

1st row IDN
2nd row IDN
3rd row IDN
4th row IDN
5th row IDN
Value Count Frequency (%)
nld 95646 28.3%
idn 47409 14.0%
mys 24942 7.4%
tha 16522 4.9%
png 16302 4.8%
cmr 12997 3.8%
gab 12542 3.7%
phl 10183 3.0%
civ 8171 2.4%
aus 6755 2.0%
Other values (203) 86790 25.7%

Most occurring characters

ValueCountFrequency (%)
N 181334
17.9%
D 154536
15.2%
L 117053
11.5%
I 61415
 
6.1%
A 55005
 
5.4%
M 49631
 
4.9%
G 46506
 
4.6%
S 41947
 
4.1%
H 35579
 
3.5%
C 33733
 
3.3%
Other values (20) 238038
23.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1014671
> 99.9%
Decimal Number 106
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 181334
17.9%
D 154536
15.2%
L 117053
11.5%
I 61415
 
6.1%
A 55005
 
5.4%
M 49631
 
4.9%
G 46506
 
4.6%
S 41947
 
4.1%
H 35579
 
3.5%
C 33733
 
3.3%
Other values (16) 237932
23.4%
Decimal Number
ValueCountFrequency (%)
0 53
50.0%
6 48
45.3%
7 3
 
2.8%
1 2
 
1.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 1014671
> 99.9%
Common 106
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 181334
17.9%
D 154536
15.2%
L 117053
11.5%
I 61415
 
6.1%
A 55005
 
5.4%
M 49631
 
4.9%
G 46506
 
4.6%
S 41947
 
4.1%
H 35579
 
3.5%
C 33733
 
3.3%
Other values (16) 237932
23.4%
Common
ValueCountFrequency (%)
0 53
50.0%
6 48
45.3%
7 3
 
2.8%
1 2
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1014777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 181334
17.9%
D 154536
15.2%
L 117053
11.5%
I 61415
 
6.1%
A 55005
 
5.4%
M 49631
 
4.9%
G 46506
 
4.6%
S 41947
 
4.1%
H 35579
 
3.5%
C 33733
 
3.3%
Other values (20) 238038
23.5%

level0Name
Text

Missing 

Distinct 213
Distinct (%) 0.1%
Missing 497950
Missing (%) 59.5%
Memory size 6.4 MiB

Length

Max length 32
Median length 27
Mean length 9.879816945
Min length 4

Characters and Unicode

Total characters 3341937
Distinct characters 63
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 10
Unique (%) < 0.1%

Sample

1st row Indonesia
2nd row Indonesia
3rd row Indonesia
4th row Indonesia
5th row Indonesia
Value Count Frequency (%)
netherlands 95646 23.0%
indonesia 47409 11.4%
malaysia 24942 6.0%
guinea 18245 4.4%
new 17522 4.2%
thailand 16522 4.0%
papua 16302 3.9%
cameroon 12997 3.1%
gabon 12542 3.0%
philippines 10183 2.5%
Other values (247) 143215 34.5%

Most occurring characters

ValueCountFrequency (%)
a 454648
13.6%
e 373810
 
11.2%
n 321392
 
9.6%
i 234110
 
7.0%
s 196756
 
5.9%
d 182043
 
5.4%
l 176460
 
5.3%
r 166938
 
5.0%
o 140867
 
4.2%
h 140773
 
4.2%
Other values (53) 954140
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2849741
85.3%
Uppercase Letter 404863
 
12.1%
Space Separator 77266
 
2.3%
Other Punctuation 9293
 
0.3%
Dash Punctuation 762
 
< 0.1%
Open Punctuation 6
 
< 0.1%
Close Punctuation 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 454648
16.0%
e 373810
13.1%
n 321392
11.3%
i 234110
8.2%
s 196756
6.9%
d 182043
 
6.4%
l 176460
 
6.2%
r 166938
 
5.9%
o 140867
 
4.9%
h 140773
 
4.9%
Other values (21) 461944
16.2%
Uppercase Letter
ValueCountFrequency (%)
N 116417
28.8%
I 57503
14.2%
G 38926
 
9.6%
M 32252
 
8.0%
C 31850
 
7.9%
P 28831
 
7.1%
T 21089
 
5.2%
B 14678
 
3.6%
S 12858
 
3.2%
E 10142
 
2.5%
Other values (15) 40317
 
10.0%
Other Punctuation
ValueCountFrequency (%)
' 8171
87.9%
, 1110
 
11.9%
. 12
 
0.1%
Space Separator
ValueCountFrequency (%)
77266
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 762
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3254604
97.4%
Common 87333
 
2.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 454648
14.0%
e 373810
11.5%
n 321392
 
9.9%
i 234110
 
7.2%
s 196756
 
6.0%
d 182043
 
5.6%
l 176460
 
5.4%
r 166938
 
5.1%
o 140867
 
4.3%
h 140773
 
4.3%
Other values (46) 866807
26.6%
Common
ValueCountFrequency (%)
77266
88.5%
' 8171
 
9.4%
, 1110
 
1.3%
- 762
 
0.9%
. 12
 
< 0.1%
( 6
 
< 0.1%
) 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3332320
99.7%
None 9617
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 454648
13.6%
e 373810
 
11.2%
n 321392
 
9.6%
i 234110
 
7.0%
s 196756
 
5.9%
d 182043
 
5.5%
l 176460
 
5.3%
r 166938
 
5.0%
o 140867
 
4.2%
h 140773
 
4.2%
Other values (47) 944523
28.3%
None
ValueCountFrequency (%)
ô 8171
85.0%
é 615
 
6.4%
ç 458
 
4.8%
í 185
 
1.9%
ã 185
 
1.9%
Å 3
 
< 0.1%

level1Gid
Text

Missing 

Distinct 2131
Distinct (%) 0.6%
Missing 499035
Missing (%) 59.7%
Memory size 6.4 MiB

Length

Max length 8
Median length 8
Mean length 7.506898515
Min length 6

Characters and Unicode

Total characters 2531131
Distinct characters 38
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 318
Unique (%) 0.1%

Sample

1st row IDN.30_1
2nd row IDN.12_1
3rd row IDN.30_1
4th row IDN.30_1
5th row IDN.29_1
Value Count Frequency (%)
nld.14_1 20426 6.1%
nld.4_1 17955 5.3%
mys.13_1 12106 3.6%
nld.9_1 10925 3.2%
mys.14_1 8546 2.5%
nld.7_1 7831 2.3%
nld.10_1 7165 2.1%
nld.11_1 7053 2.1%
nld.3_1 6890 2.0%
idn.34_1 5494 1.6%
Other values (2121) 232783 69.0%

Most occurring characters

ValueCountFrequency (%)
1 498400
19.7%
_ 337128
13.3%
. 335630
13.3%
N 181311
 
7.2%
D 154522
 
6.1%
L 117047
 
4.6%
4 76894
 
3.0%
I 61413
 
2.4%
2 59174
 
2.3%
3 56308
 
2.2%
Other values (28) 653304
25.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1011554
40.0%
Decimal Number 846819
33.5%
Connector Punctuation 337128
 
13.3%
Other Punctuation 335630
 
13.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 181311
17.9%
D 154522
15.3%
L 117047
11.6%
I 61413
 
6.1%
A 54659
 
5.4%
M 49326
 
4.9%
G 46552
 
4.6%
S 41703
 
4.1%
H 35625
 
3.5%
C 33252
 
3.3%
Other values (16) 236144
23.3%
Decimal Number
ValueCountFrequency (%)
1 498400
58.9%
4 76894
 
9.1%
2 59174
 
7.0%
3 56308
 
6.6%
9 36734
 
4.3%
0 28618
 
3.4%
7 25205
 
3.0%
8 23369
 
2.8%
6 21357
 
2.5%
5 20760
 
2.5%
Connector Punctuation
ValueCountFrequency (%)
_ 337128
100.0%
Other Punctuation
ValueCountFrequency (%)
. 335630
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1519577
60.0%
Latin 1011554
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 181311
17.9%
D 154522
15.3%
L 117047
11.6%
I 61413
 
6.1%
A 54659
 
5.4%
M 49326
 
4.9%
G 46552
 
4.6%
S 41703
 
4.1%
H 35625
 
3.5%
C 33252
 
3.3%
Other values (16) 236144
23.3%
Common
ValueCountFrequency (%)
1 498400
32.8%
_ 337128
22.2%
. 335630
22.1%
4 76894
 
5.1%
2 59174
 
3.9%
3 56308
 
3.7%
9 36734
 
2.4%
0 28618
 
1.9%
7 25205
 
1.7%
8 23369
 
1.5%
Other values (2) 42117
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2531131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 498400
19.7%
_ 337128
13.3%
. 335630
13.3%
N 181311
 
7.2%
D 154522
 
6.1%
L 117047
 
4.6%
4 76894
 
3.0%
I 61413
 
2.4%
2 59174
 
2.3%
3 56308
 
2.2%
Other values (28) 653304
25.8%

level1Name
Text

Missing 

Distinct 2070
Distinct (%) 0.6%
Missing 499035
Missing (%) 59.7%
Memory size 6.4 MiB

Length

Max length 32
Median length 29
Mean length 9.465842562
Min length 3

Characters and Unicode

Total characters 3191636
Distinct characters 130
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 312
Unique (%) 0.1%

Sample

1st row Sumatera Barat
2nd row Kalimantan Barat
3rd row Sumatera Barat
4th row Sumatera Barat
5th row Sulawesi Utara
Value Count Frequency (%)
zuid-holland 19921 4.7%
gelderland 17955 4.3%
kalimantan 12145 2.9%
sabah 12106 2.9%
barat 11903 2.8%
noord-holland 10925 2.6%
jawa 9106 2.2%
sarawak 8546 2.0%
timur 8363 2.0%
limburg 7831 1.9%
Other values (2239) 302728 71.8%

Most occurring characters

ValueCountFrequency (%)
a 461320
14.5%
n 235409
 
7.4%
e 207213
 
6.5%
r 207212
 
6.5%
l 194779
 
6.1%
o 174582
 
5.5%
i 162392
 
5.1%
d 146654
 
4.6%
u 138738
 
4.3%
t 129029
 
4.0%
Other values (120) 1134308
35.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2571860
80.6%
Uppercase Letter 475768
 
14.9%
Space Separator 84355
 
2.6%
Dash Punctuation 58213
 
1.8%
Other Punctuation 1317
 
< 0.1%
Open Punctuation 59
 
< 0.1%
Close Punctuation 59
 
< 0.1%
Modifier Symbol 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 461320
17.9%
n 235409
9.2%
e 207213
8.1%
r 207212
8.1%
l 194779
 
7.6%
o 174582
 
6.8%
i 162392
 
6.3%
d 146654
 
5.7%
u 138738
 
5.4%
t 129029
 
5.0%
Other values (74) 514532
20.0%
Uppercase Letter
ValueCountFrequency (%)
S 60946
12.8%
N 39234
 
8.2%
H 37269
 
7.8%
B 34551
 
7.3%
M 30561
 
6.4%
Z 28371
 
6.0%
T 26791
 
5.6%
G 24375
 
5.1%
O 23090
 
4.9%
K 21541
 
4.5%
Other values (24) 149039
31.3%
Other Punctuation
ValueCountFrequency (%)
, 923
70.1%
' 325
 
24.7%
! 56
 
4.3%
. 7
 
0.5%
/ 6
 
0.5%
Open Punctuation
ValueCountFrequency (%)
[ 58
98.3%
( 1
 
1.7%
Close Punctuation
ValueCountFrequency (%)
] 58
98.3%
) 1
 
1.7%
Space Separator
ValueCountFrequency (%)
84355
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 58213
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3047628
95.5%
Common 144008
 
4.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 461320
15.1%
n 235409
 
7.7%
e 207213
 
6.8%
r 207212
 
6.8%
l 194779
 
6.4%
o 174582
 
5.7%
i 162392
 
5.3%
d 146654
 
4.8%
u 138738
 
4.6%
t 129029
 
4.2%
Other values (108) 990300
32.5%
Common
ValueCountFrequency (%)
84355
58.6%
- 58213
40.4%
, 923
 
0.6%
' 325
 
0.2%
[ 58
 
< 0.1%
] 58
 
< 0.1%
! 56
 
< 0.1%
. 7
 
< 0.1%
/ 6
 
< 0.1%
` 5
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3162469
99.1%
None 28570
 
0.9%
Latin Ext Additional 597
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 461320
14.6%
n 235409
 
7.4%
e 207213
 
6.6%
r 207212
 
6.6%
l 194779
 
6.2%
o 174582
 
5.5%
i 162392
 
5.1%
d 146654
 
4.6%
u 138738
 
4.4%
t 129029
 
4.1%
Other values (54) 1105141
34.9%
None
ValueCountFrequency (%)
é 14382
50.3%
â 7294
25.5%
í 1163
 
4.1%
á 1121
 
3.9%
ê 657
 
2.3%
ó 577
 
2.0%
ô 433
 
1.5%
ì 419
 
1.5%
ã 329
 
1.2%
É 283
 
1.0%
Other values (42) 1912
 
6.7%
Latin Ext Additional
ValueCountFrequency (%)
185
31.0%
72
 
12.1%
54
 
9.0%
52
 
8.7%
41
 
6.9%
38
 
6.4%
ế 37
 
6.2%
37
 
6.2%
30
 
5.0%
18
 
3.0%
Other values (4) 33
 
5.5%

level2Gid
Text

Missing 

Distinct 8865
Distinct (%) 2.7%
Missing 501994
Missing (%) 60.0%
Memory size 6.4 MiB

Length

Max length 12
Median length 11
Mean length 9.939694508
Min length 7

Characters and Unicode

Total characters 3321995
Distinct characters 38
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2458
Unique (%) 0.7%

Sample

1st row IDN.30.15_1
2nd row IDN.12.14_1
3rd row IDN.30.5_1
4th row IDN.30.5_1
5th row IDN.29.10_1
Value Count Frequency (%)
nld.14.38_1 4144 1.2%
cmr.10.3_1 3732 1.1%
nld.14.67_2 2667 0.8%
civ.1.1_1 2431 0.7%
nld.4.44_1 1997 0.6%
idn.34.6_1 1970 0.6%
nld.14.84_1 1952 0.6%
nld.14.2_1 1880 0.6%
nld.9.4_1 1814 0.5%
png.14.1_1 1673 0.5%
Other values (8855) 309955 92.7%

Most occurring characters

ValueCountFrequency (%)
. 666840
20.1%
1 590323
17.8%
_ 334215
10.1%
N 181156
 
5.5%
D 154398
 
4.6%
2 150376
 
4.5%
4 127623
 
3.8%
3 119036
 
3.6%
L 116827
 
3.5%
I 61334
 
1.8%
Other values (28) 819867
24.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1318401
39.7%
Uppercase Letter 1002539
30.2%
Other Punctuation 666840
20.1%
Connector Punctuation 334215
 
10.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 181156
18.1%
D 154398
15.4%
L 116827
11.7%
I 61334
 
6.1%
A 54491
 
5.4%
M 48964
 
4.9%
G 45690
 
4.6%
S 39463
 
3.9%
H 35575
 
3.5%
C 32901
 
3.3%
Other values (16) 231740
23.1%
Decimal Number
ValueCountFrequency (%)
1 590323
44.8%
2 150376
 
11.4%
4 127623
 
9.7%
3 119036
 
9.0%
6 60673
 
4.6%
7 57260
 
4.3%
9 57181
 
4.3%
5 54603
 
4.1%
8 52509
 
4.0%
0 48817
 
3.7%
Other Punctuation
ValueCountFrequency (%)
. 666840
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 334215
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2319456
69.8%
Latin 1002539
30.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 181156
18.1%
D 154398
15.4%
L 116827
11.7%
I 61334
 
6.1%
A 54491
 
5.4%
M 48964
 
4.9%
G 45690
 
4.6%
S 39463
 
3.9%
H 35575
 
3.5%
C 32901
 
3.3%
Other values (16) 231740
23.1%
Common
ValueCountFrequency (%)
. 666840
28.7%
1 590323
25.5%
_ 334215
14.4%
2 150376
 
6.5%
4 127623
 
5.5%
3 119036
 
5.1%
6 60673
 
2.6%
7 57260
 
2.5%
9 57181
 
2.5%
5 54603
 
2.4%
Other values (2) 101326
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3321995
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 666840
20.1%
1 590323
17.8%
_ 334215
10.1%
N 181156
 
5.5%
D 154398
 
4.6%
2 150376
 
4.5%
4 127623
 
3.8%
3 119036
 
3.6%
L 116827
 
3.5%
I 61334
 
1.8%
Other values (28) 819867
24.7%

level2Name
Text

Missing 

Distinct 8635
Distinct (%) 2.6%
Missing 501999
Missing (%) 60.0%
Memory size 6.4 MiB

Length

Max length 32
Median length 28
Mean length 9.193734478
Min length 1

Characters and Unicode

Total characters 3072638
Distinct characters 167
Distinct categories 10
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2322
Unique (%) 0.7%

Sample

1st row Pesisir Selatan
2nd row Sintang
3rd row Kepulauan Mentawai
4th row Kepulauan Mentawai
5th row Minahasa Selatan
Value Count Frequency (%)
leiden 4144 0.9%
kota 4103 0.9%
de 3968 0.9%
océan 3732 0.8%
kutai 3500 0.8%
timur 3319 0.7%
city 2951 0.7%
rotterdam 2667 0.6%
et 2526 0.6%
abidjan 2431 0.5%
Other values (8887) 415746 92.6%

Most occurring characters

ValueCountFrequency (%)
a 382262
 
12.4%
e 266254
 
8.7%
n 240185
 
7.8%
o 187854
 
6.1%
r 167643
 
5.5%
i 166065
 
5.4%
u 144544
 
4.7%
t 118217
 
3.8%
114877
 
3.7%
l 107136
 
3.5%
Other values (157) 1177601
38.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2471835
80.4%
Uppercase Letter 453959
 
14.8%
Space Separator 114877
 
3.7%
Dash Punctuation 23094
 
0.8%
Other Punctuation 4972
 
0.2%
Decimal Number 2213
 
0.1%
Open Punctuation 856
 
< 0.1%
Close Punctuation 675
 
< 0.1%
Math Symbol 88
 
< 0.1%
Modifier Symbol 69
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 382262
15.5%
e 266254
10.8%
n 240185
9.7%
o 187854
 
7.6%
r 167643
 
6.8%
i 166065
 
6.7%
u 144544
 
5.8%
t 118217
 
4.8%
l 107136
 
4.3%
g 97879
 
4.0%
Other values (92) 593796
24.0%
Uppercase Letter
ValueCountFrequency (%)
M 50216
 
11.1%
B 41287
 
9.1%
K 40415
 
8.9%
S 35291
 
7.8%
T 31125
 
6.9%
L 24861
 
5.5%
A 23385
 
5.2%
C 19951
 
4.4%
N 19626
 
4.3%
R 19505
 
4.3%
Other values (32) 148297
32.7%
Decimal Number
ValueCountFrequency (%)
9 798
36.1%
1 487
22.0%
8 328
14.8%
7 209
 
9.4%
0 187
 
8.5%
3 124
 
5.6%
6 30
 
1.4%
2 21
 
0.9%
5 17
 
0.8%
4 12
 
0.5%
Other Punctuation
ValueCountFrequency (%)
' 3095
62.2%
. 1138
 
22.9%
/ 374
 
7.5%
, 254
 
5.1%
& 72
 
1.4%
# 33
 
0.7%
? 6
 
0.1%
Space Separator
ValueCountFrequency (%)
114877
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 23094
100.0%
Open Punctuation
ValueCountFrequency (%)
( 856
100.0%
Close Punctuation
ValueCountFrequency (%)
) 675
100.0%
Math Symbol
ValueCountFrequency (%)
+ 88
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 69
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2925794
95.2%
Common 146844
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 382262
 
13.1%
e 266254
 
9.1%
n 240185
 
8.2%
o 187854
 
6.4%
r 167643
 
5.7%
i 166065
 
5.7%
u 144544
 
4.9%
t 118217
 
4.0%
l 107136
 
3.7%
g 97879
 
3.3%
Other values (134) 1047755
35.8%
Common
ValueCountFrequency (%)
114877
78.2%
- 23094
 
15.7%
' 3095
 
2.1%
. 1138
 
0.8%
( 856
 
0.6%
9 798
 
0.5%
) 675
 
0.5%
1 487
 
0.3%
/ 374
 
0.3%
8 328
 
0.2%
Other values (13) 1122
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3044873
99.1%
None 27150
 
0.9%
Latin Ext Additional 606
 
< 0.1%
IPA Ext 9
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 382262
 
12.6%
e 266254
 
8.7%
n 240185
 
7.9%
o 187854
 
6.2%
r 167643
 
5.5%
i 166065
 
5.5%
u 144544
 
4.7%
t 118217
 
3.9%
114877
 
3.8%
l 107136
 
3.5%
Other values (65) 1149836
37.8%
None
ValueCountFrequency (%)
é 15359
56.6%
è 1744
 
6.4%
â 1202
 
4.4%
É 1058
 
3.9%
ô 1018
 
3.7%
í 1011
 
3.7%
ú 913
 
3.4%
á 886
 
3.3%
ñ 666
 
2.5%
ó 631
 
2.3%
Other values (56) 2662
 
9.8%
Latin Ext Additional
ValueCountFrequency (%)
115
19.0%
93
15.3%
74
12.2%
ế 70
11.6%
53
8.7%
47
7.8%
28
 
4.6%
25
 
4.1%
21
 
3.5%
17
 
2.8%
Other values (15) 63
10.4%
IPA Ext
ValueCountFrequency (%)
ə 9
100.0%

level3Gid
Text

Missing 

Distinct 10466
Distinct (%) 7.5%
Missing 695853
Missing (%) 83.2%
Memory size 6.4 MiB

Length

Max length 15
Median length 14
Mean length 12.10025222
Min length 11

Characters and Unicode

Total characters 1698343
Distinct characters 36
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3314
Unique (%) 2.4%

Sample

1st row IDN.30.15.10_1
2nd row IDN.12.14.4_1
3rd row IDN.30.5.9_1
4th row IDN.30.5.9_1
5th row IDN.29.10.3_1
Value Count Frequency (%)
civ.1.1.1_1 2431 1.7%
civ.14.2.2_1 1100 0.8%
cmr.10.3.2_1 1089 0.8%
idn.9.16.3_1 919 0.7%
idn.34.6.15_1 770 0.5%
civ.2.1.2_1 762 0.5%
idn.22.5.10_1 758 0.5%
tza.13.10.27_1 647 0.5%
cmr.10.3.4_1 621 0.4%
cmr.10.3.6_1 617 0.4%
Other values (10456) 130642 93.1%

Most occurring characters

ValueCountFrequency (%)
. 421068
24.8%
1 301027
17.7%
_ 140356
 
8.3%
2 95511
 
5.6%
3 71465
 
4.2%
I 60394
 
3.6%
N 58352
 
3.4%
D 53842
 
3.2%
4 52782
 
3.1%
5 38084
 
2.2%
Other values (26) 405462
23.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 715957
42.2%
Other Punctuation 421068
24.8%
Uppercase Letter 420962
24.8%
Connector Punctuation 140356
 
8.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 60394
14.3%
N 58352
13.9%
D 53842
12.8%
H 33851
8.0%
T 27392
 
6.5%
C 25011
 
5.9%
A 24690
 
5.9%
M 21474
 
5.1%
R 20395
 
4.8%
E 18212
 
4.3%
Other values (14) 77349
18.4%
Decimal Number
ValueCountFrequency (%)
1 301027
42.0%
2 95511
 
13.3%
3 71465
 
10.0%
4 52782
 
7.4%
5 38084
 
5.3%
6 36571
 
5.1%
9 33035
 
4.6%
0 32171
 
4.5%
7 28483
 
4.0%
8 26828
 
3.7%
Other Punctuation
ValueCountFrequency (%)
. 421068
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 140356
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1277381
75.2%
Latin 420962
 
24.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 60394
14.3%
N 58352
13.9%
D 53842
12.8%
H 33851
8.0%
T 27392
 
6.5%
C 25011
 
5.9%
A 24690
 
5.9%
M 21474
 
5.1%
R 20395
 
4.8%
E 18212
 
4.3%
Other values (14) 77349
18.4%
Common
ValueCountFrequency (%)
. 421068
33.0%
1 301027
23.6%
_ 140356
 
11.0%
2 95511
 
7.5%
3 71465
 
5.6%
4 52782
 
4.1%
5 38084
 
3.0%
6 36571
 
2.9%
9 33035
 
2.6%
0 32171
 
2.5%
Other values (2) 55311
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1698343
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 421068
24.8%
1 301027
17.7%
_ 140356
 
8.3%
2 95511
 
5.6%
3 71465
 
4.2%
I 60394
 
3.6%
N 58352
 
3.4%
D 53842
 
3.2%
4 52782
 
3.1%
5 38084
 
2.2%
Other values (26) 405462
23.9%

level3Name
Text

Missing 

Distinct 9920
Distinct (%) 7.2%
Missing 697659
Missing (%) 83.4%
Memory size 6.4 MiB

Length

Max length 32
Median length 29
Mean length 8.747542403
Min length 2

Characters and Unicode

Total characters 1211972
Distinct characters 142
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3029
Unique (%) 2.2%

Sample

1st row Pancung Soal
2nd row Kayan Hilir
3rd row Sipora Selatan
4th row Sipora Selatan
5th row Amurang
Value Count Frequency (%)
selatan 2482 1.3%
abidjan 2431 1.2%
tengah 2360 1.2%
utara 1834 0.9%
ban 1662 0.8%
n.a 1613 0.8%
barat 1608 0.8%
mae 1569 0.8%
1 1529 0.8%
na 1448 0.7%
Other values (9982) 177214 90.5%

Most occurring characters

ValueCountFrequency (%)
a 199433
16.5%
n 96033
 
7.9%
o 74905
 
6.2%
i 66728
 
5.5%
e 61766
 
5.1%
u 60137
 
5.0%
57200
 
4.7%
r 54108
 
4.5%
g 41770
 
3.4%
l 37532
 
3.1%
Other values (132) 462360
38.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 938705
77.5%
Uppercase Letter 192847
 
15.9%
Space Separator 57200
 
4.7%
Decimal Number 9632
 
0.8%
Other Punctuation 5353
 
0.4%
Dash Punctuation 3729
 
0.3%
Open Punctuation 2281
 
0.2%
Close Punctuation 2225
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 199433
21.2%
n 96033
10.2%
o 74905
 
8.0%
i 66728
 
7.1%
e 61766
 
6.6%
u 60137
 
6.4%
r 54108
 
5.8%
g 41770
 
4.4%
l 37532
 
4.0%
t 32903
 
3.5%
Other values (74) 213390
22.7%
Uppercase Letter
ValueCountFrequency (%)
S 23353
12.1%
B 20810
10.8%
T 18280
9.5%
M 17566
9.1%
K 15980
 
8.3%
P 12777
 
6.6%
A 12574
 
6.5%
N 9940
 
5.2%
C 9519
 
4.9%
L 9004
 
4.7%
Other values (25) 43044
22.3%
Decimal Number
ValueCountFrequency (%)
1 3088
32.1%
2 2093
21.7%
4 1397
14.5%
6 789
 
8.2%
3 732
 
7.6%
0 397
 
4.1%
5 358
 
3.7%
8 292
 
3.0%
9 288
 
3.0%
7 198
 
2.1%
Other Punctuation
ValueCountFrequency (%)
. 3464
64.7%
' 1197
 
22.4%
/ 563
 
10.5%
, 106
 
2.0%
! 14
 
0.3%
\ 4
 
0.1%
: 3
 
0.1%
* 1
 
< 0.1%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
57200
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3729
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2281
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2225
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1131552
93.4%
Common 80420
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 199433
17.6%
n 96033
 
8.5%
o 74905
 
6.6%
i 66728
 
5.9%
e 61766
 
5.5%
u 60137
 
5.3%
r 54108
 
4.8%
g 41770
 
3.7%
l 37532
 
3.3%
t 32903
 
2.9%
Other values (109) 406237
35.9%
Common
ValueCountFrequency (%)
57200
71.1%
- 3729
 
4.6%
. 3464
 
4.3%
1 3088
 
3.8%
( 2281
 
2.8%
) 2225
 
2.8%
2 2093
 
2.6%
4 1397
 
1.7%
' 1197
 
1.5%
6 789
 
1.0%
Other values (13) 2957
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1200460
99.1%
None 10685
 
0.9%
Latin Ext Additional 827
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 199433
16.6%
n 96033
 
8.0%
o 74905
 
6.2%
i 66728
 
5.6%
e 61766
 
5.1%
u 60137
 
5.0%
57200
 
4.8%
r 54108
 
4.5%
g 41770
 
3.5%
l 37532
 
3.1%
Other values (65) 450848
37.6%
None
ValueCountFrequency (%)
é 6399
59.9%
è 1009
 
9.4%
ơ 414
 
3.9%
ư 410
 
3.8%
ú 369
 
3.5%
ï 312
 
2.9%
ñ 261
 
2.4%
á 198
 
1.9%
í 170
 
1.6%
ê 155
 
1.5%
Other values (34) 988
 
9.2%
Latin Ext Additional
ValueCountFrequency (%)
143
17.3%
111
13.4%
88
10.6%
77
9.3%
55
 
6.7%
ế 55
 
6.7%
41
 
5.0%
40
 
4.8%
33
 
4.0%
32
 
3.9%
Other values (13) 152
18.4%

iucnRedListCategory
Text

Missing 

Distinct 9
Distinct (%) < 0.1%
Missing 73123
Missing (%) 8.7%
Memory size 6.4 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 1526172
Distinct characters 11
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row NE
2nd row CR
3rd row NE
4th row EN
5th row NE
Value Count Frequency (%)
ne 566671 74.3%
lc 172272 22.6%
vu 7992 1.0%
nt 6121 0.8%
en 4663 0.6%
dd 3658 0.5%
cr 1620 0.2%
ew 45 < 0.1%
ex 44 < 0.1%

Most occurring characters

ValueCountFrequency (%)
N 577455
37.8%
E 571423
37.4%
C 173892
 
11.4%
L 172272
 
11.3%
V 7992
 
0.5%
U 7992
 
0.5%
D 7316
 
0.5%
T 6121
 
0.4%
R 1620
 
0.1%
W 45
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1526172
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 577455
37.8%
E 571423
37.4%
C 173892
 
11.4%
L 172272
 
11.3%
V 7992
 
0.5%
U 7992
 
0.5%
D 7316
 
0.5%
T 6121
 
0.4%
R 1620
 
0.1%
W 45
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1526172
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 577455
37.8%
E 571423
37.4%
C 173892
 
11.4%
L 172272
 
11.3%
V 7992
 
0.5%
U 7992
 
0.5%
D 7316
 
0.5%
T 6121
 
0.4%
R 1620
 
0.1%
W 45
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1526172
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 577455
37.8%
E 571423
37.4%
C 173892
 
11.4%
L 172272
 
11.3%
V 7992
 
0.5%
U 7992
 
0.5%
D 7316
 
0.5%
T 6121
 
0.4%
R 1620
 
0.1%
W 45
 
< 0.1%